Abstract

Unmeasured confounding may undermine the validity of causal inference with observational
studies. Sensitivity analysis provides an attractive way to partially circumvent this issue by
assessing the potential influence of unmeasured confounding on the causal conclusions.
However, previous sensitivity analysis approaches often make strong and untestable assumptions
such as having a confounder that is binary, or having no interaction between the effects of
the exposure and the confounder on the outcome, or having only one confounder. Without
imposing any assumptions on the confounder or confounders, we derive a bounding factor
and a sharp inequality such that the sensitivity analysis parameters must satisfy the
inequality if an unmeasured confounder is to explain away the observed effect estimate or reduce
it to a particular level. Our approach is easy to implement and involves only two sensitivity
parameters. Surprisingly, our bounding factor, which makes no simplifying assumptions, is
no more conservative than a number of previous sensitivity analysis techniques that do make
assumptions. Our new bounding factor implies not only the traditional Cornfield conditions
that both the relative risk of the exposure on the confounder and that of the confounder on the
outcome must satisfy, but also a high threshold that the maximum of these relative risks must
satisfy. Furthermore, this new bounding factor can be viewed as a measure of the strength of
confounding between the exposure and the outcome induced by a confounder.

Key Words: Bounding factor; Causality; Confounding; Cornfield condition; Observational 
study. 

1 Introduction 

Causal inference with observational studies is of great interest and importance in many
scientific disciplines. Although unmeasured confounding between the exposure and the outcome
may bias the estimation of the true causal effect, an approach often called “sensitivity
analysis” or “bias analysis” over a range of sensitivity parameters sometimes allows researchers
to make causal inferences even without full control of the confounders of the relationship 
between the exposure and outcome. 

Sensitivity analysis plays a central role in assessing the influence of unmeasured
confounding on the causal conclusions. However, many sensitivity analysis techniques
require additional untestable assumptions. For instance, some authors assume a single
binary confounder. Researchers also often make a homogeneity assumption that there
is no interaction between the effects of the exposure and the confounder on the outcome.
Some sensitivity analysis techniques only allow one to assess how strong an unmeasured
confounder would have to be to completely explain away an effect, but do not allow one
to assess what the effect estimate might be under weaker unmeasured confounding scenarios,
i.e., do not allow one to do sensitivity analysis under alternative hypotheses. Performing
sensitivity analysis under alternative hypotheses can be quite challenging because more
parameters are needed in the sensitivity analysis. Cornfield et al.’s early work on sensitivity
analysis for the cigarette smoking and lung cancer association, which helped initiate the
entire field of sensitivity analysis, in fact made all three simplifying assumptions: a single
binary confounder, no interaction, and only sensitivity analysis for the null hypothesis of no
causal effect. Although some sensitivity analysis results exist for general confounders, they
are only easy to implement under some of the above simplifying assumptions.

In this paper, we propose a new bounding factor and sensitivity analysis technique
without any assumptions about the unmeasured confounder or confounders. None of the null
hypothesis, a single binary confounder, or the no-interaction assumption is required for
using the bounding factor. Nonetheless, our new bounding factor, which makes no simplifying
assumptions, is no more conservative than many previous sensitivity analysis techniques that
do make assumptions and is furthermore easy to implement. Moreover, we show that the new
bounding factor implies not only the classical Cornfield conditions that both the relative risk
of the exposure on the confounder and that of the confounder on the outcome must satisfy,
but also a stronger condition that the maximum of these relative risks must satisfy. The new
bounding factor can be viewed as a measure of the strength of confounding between the
exposure and the outcome resulting from the confounder. We begin by considering outcomes
which are binary and extend our results further to time-to-event and non-negative count or
continuous outcomes. We consider both ratio and difference scales.

The claim that our technique is “without assumptions” requires some clarification. As we 
will see below, we will, without any assumptions, be able to make statements of the form: 
“For an observed association to be due solely to unmeasured confounding, two sensitivity 
analysis parameters must satisfy [a specific inequality].” We will also, without assumptions, 
be able to make statements of the form: “For unmeasured confounding alone to be able to 
reduce an observed association [to a given level], two sensitivity analysis parameters must 
satisfy [another specific inequality].” We believe the ability to make statements of this form
without imposing any specific structure on the nature of the unmeasured confounder or
confounders constitutes a major advance in the literature.

However, if statements are made of the form, “If the sensitivity analysis parameters take
[specified values], then such unmeasured confounding can reduce the observed estimate by
no more than [a specific level],” then the specification of the sensitivity analysis
parameters could itself of course be viewed as an assumption. Moreover, when placing the results
within a counterfactual or potential outcomes framework, the assumptions implicit within
that framework of course would be needed also to give a potential outcomes interpretation to
the sensitivity analysis. Thus certain types of statements concerning the sensitivity of
conclusions to unmeasured confounding can be made “without assumptions,” while other types
of statements do require assumptions concerning the specification of the sensitivity analysis 
parameters themselves, or those implicit within the potential outcomes framework. 

Our title perhaps merits one further qualification, which is that what is called in this
paper “sensitivity analysis” is generally now referred to as “bias analysis” in the epidemiologic
literature. Moreover, such “bias analysis” is relevant not only to problems of unmeasured
confounding but also measurement error and selection bias, and our focus in this paper only
concerns unmeasured confounding. The term “sensitivity analysis” is, however, still
employed in statistics, econometrics, and in many of the social sciences for issues of
unmeasured confounding. We believe the technique presented in this paper will be useful across
this range of disciplines and have chosen to use the broader term, while acknowledging that 
terminology in epidemiology has shifted. 

2 Main Result: A New Bounding Factor


Let E denote the exposure, D denote a binary outcome, C denote the measured confounders, 
and U denote one or more unmeasured confounders. We will assume for what follows that 
the exposure E is binary, but all of the results below are also applicable to a categorical 
or continuous exposure and could be applied comparing any two levels of E. For ease of 
notation, we assume that the unmeasured confounder U is categorical with levels $0, 1, \ldots, K-1$.
But all the conclusions hold for U of general type (categorical, continuous, or mixed; single
or multiple confounders). We provide proofs and theoretical technical details for general U 
in the eAppendix. 

Let $\mathrm{RR}^{\mathrm{obs}}_{ED|c} = P(D = 1 \mid E = 1, C = c) / P(D = 1 \mid E = 0, C = c)$ denote the observed
relative risk of the exposure E on the outcome D within stratum of measured confounders
C = c. Define $\mathrm{RR}_{EU,k|c} = P(U = k \mid E = 1, C = c) / P(U = k \mid E = 0, C = c)$ as the relative
risk of exposure on category k of the unmeasured confounder within stratum of measured
confounders C = c. We use $\mathrm{RR}_{EU|c} = \max_k \mathrm{RR}_{EU,k|c}$ to denote the maximum of these relative
risks between E and U, which we will call the maximal relative risk of E on U within stratum
C = c. Define

$$\mathrm{RR}_{UD|E=0,c} = \frac{\max_k P(D = 1 \mid E = 0, C = c, U = k)}{\min_k P(D = 1 \mid E = 0, C = c, U = k)}$$

as the maximum of the effect of U on D among the unexposed comparing any two categories
of U (i.e., the ratio of the maximum and minimum of the probabilities of the outcome over
strata of U without exposure and within stratum C = c); similarly, define

$$\mathrm{RR}_{UD|E=1,c} = \frac{\max_k P(D = 1 \mid E = 1, C = c, U = k)}{\min_k P(D = 1 \mid E = 1, C = c, U = k)}$$

as the maximum of the effect of U on D among the exposed comparing any two categories of
U (i.e., the ratio of the maximum and minimum of the probabilities of the outcome over strata
of U with exposure and within stratum C = c). We use $\mathrm{RR}_{UD|c} = \max(\mathrm{RR}_{UD|E=1,c}, \mathrm{RR}_{UD|E=0,c})$
to denote the maximum of the relative risks between U and D with and without exposure,
defined as the maximal relative risk of U on D within stratum C = c. Note that if U is a vector
that contains multiple unmeasured confounders, then $\mathrm{RR}_{EU|c}$ and $\mathrm{RR}_{UD|c}$ are defined as the
maximum relative risk comparing any two categories of the vector U.

If C and U suffice to control for confounding for the effect of E on D, the standardized
relative risk

$$\mathrm{RR}^{\mathrm{true}}_{ED|c} = \frac{\sum_{k=0}^{K-1} P(D = 1 \mid E = 1, C = c, U = k)\, P(U = k \mid C = c)}{\sum_{k=0}^{K-1} P(D = 1 \mid E = 0, C = c, U = k)\, P(U = k \mid C = c)}$$

is the true causal relative risk of the exposure E on the outcome D within stratum C = c.

In the main text, we focus the discussion on the whole population. We further show in the 
eAppendix that all the conclusions also hold for exposed and unexposed subpopulations. 

We will for the next several sections assume all analyses are carried out within strata of C, 
and thus the condition C = c is omitted and kept implicitly in all the conditional probabilities, 
e.g., $\mathrm{RR}^{\mathrm{obs}}_{ED|c}$ is replaced by $\mathrm{RR}^{\mathrm{obs}}_{ED}$ for notational simplicity. Later in the paper we will
comment on how the results are applicable to estimation averaged over C, rather than conditional
on C. 

The relative risk pair $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$ measures the strength of confounding between the
exposure E and the outcome D induced by the confounder U. Our main result ties the ratio of
the observed relative risk $\mathrm{RR}^{\mathrm{obs}}_{ED}$, adjusted only for measured confounders C, and the true
relative risk $\mathrm{RR}^{\mathrm{true}}_{ED}$, adjusted also for unmeasured confounders U, to the strength of confounding,
$(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$. Without any assumptions, we have the following result:

Result 1.

$$\mathrm{RR}^{\mathrm{true}}_{ED} \ge \mathrm{RR}^{\mathrm{obs}}_{ED} \Big/ \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1}. \qquad (1)$$


Result 1 shows that even in the presence of unmeasured confounding the true relative risk
must be at least as large as $\mathrm{RR}^{\mathrm{obs}}_{ED} / \{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD} / (\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)\}$. In the eAppendix,
we provide a proof for result (1) and also show that the inequality is sharp in the sense that
we can always construct a model with a confounder U to attain the equality. The quantity
$(\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}) / (\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$ is a new joint bounding factor for the relative risk.
Although quite simple, this bound using both $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ has several important implications.
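
Result 1 can be checked numerically on any fully specified confounding model. The following Python sketch is illustrative only: the function name `summarize` and every probability in the example are hypothetical choices, not values from this paper. It computes the observed and standardized relative risks for a categorical U, the two sensitivity parameters, and the lower bound implied by (1).

```python
from typing import Dict, Sequence


def summarize(
    p_e1: float,
    p_u_e1: Sequence[float],
    p_u_e0: Sequence[float],
    p_d_e1: Sequence[float],
    p_d_e0: Sequence[float],
) -> Dict[str, float]:
    """Compute the quantities in Result 1 for a categorical confounder U.

    p_e1      : P(E = 1)
    p_u_e1[k] : P(U = k | E = 1);       p_u_e0[k] : P(U = k | E = 0)
    p_d_e1[k] : P(D = 1 | E = 1, U = k); p_d_e0[k] : P(D = 1 | E = 0, U = k)
    All probabilities are assumed strictly positive so that ratios exist.
    """
    # Marginal distribution of U over the whole population.
    p_u = [p_e1 * a + (1 - p_e1) * b for a, b in zip(p_u_e1, p_u_e0)]

    # Observed relative risk: each arm is averaged over its own P(U | E).
    risk1 = sum(d * u for d, u in zip(p_d_e1, p_u_e1))
    risk0 = sum(d * u for d, u in zip(p_d_e0, p_u_e0))
    rr_obs = risk1 / risk0

    # Standardized relative risk: both arms averaged over the same P(U).
    std1 = sum(d * u for d, u in zip(p_d_e1, p_u))
    std0 = sum(d * u for d, u in zip(p_d_e0, p_u))
    rr_true = std1 / std0

    # Sensitivity parameters as defined in the text.
    rr_eu = max(a / b for a, b in zip(p_u_e1, p_u_e0))
    rr_ud = max(max(p_d_e1) / min(p_d_e1), max(p_d_e0) / min(p_d_e0))
    bf = rr_eu * rr_ud / (rr_eu + rr_ud - 1)

    return {
        "rr_obs": rr_obs,
        "rr_true": rr_true,
        "rr_eu": rr_eu,
        "rr_ud": rr_ud,
        "bounding_factor": bf,
        "lower_bound": rr_obs / bf,
    }


if __name__ == "__main__":
    # Hypothetical three-level confounder, chosen only for illustration.
    result = summarize(
        p_e1=0.4,
        p_u_e1=[0.2, 0.3, 0.5],
        p_u_e0=[0.5, 0.3, 0.2],
        p_d_e1=[0.10, 0.20, 0.40],
        p_d_e0=[0.05, 0.10, 0.15],
    )
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

In this hypothetical model the standardized relative risk sits above the lower bound, as (1) requires; when the two conditional distributions of U coincide, the bounding factor equals one and the observed and standardized relative risks agree.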

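First, the result essentially allows for sensitivity analysis without assumptions insofar as
for an unmeasured confounder to reduce an observed estimated $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to an actual relative
risk of $\mathrm{RR}^{\mathrm{true}}_{ED}$, the sensitivity analysis parameters $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must be sufficiently large
to satisfy the inequality

$$\frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1} \ge \frac{\mathrm{RR}^{\mathrm{obs}}_{ED}}{\mathrm{RR}^{\mathrm{true}}_{ED}}.$$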

This statement holds without any assumptions about the nature of the unmeasured
confounder. One could plot those values of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ that would be required to explain
away the effect estimate (or the lower limit of a confidence interval). To conduct sensitivity
analysis with pre-specified strength of the unmeasured confounder, $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$, we
can divide the observed relative risk and its confidence limits by
$(\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}) / (\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$, in order to obtain a point estimate and confidence limits
of the lower bound of the true causal effect of the exposure E on the outcome D. We will refer
to the relative risk adjusted only for C, when divided by the bounding factor
$(\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}) / (\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$, as the corrected relative risk. It is “corrected” in the sense
that an unmeasured confounder cannot reduce the relative risk any further than what is obtained
by division by its bounding factor. As an example, suppose we have an observed relative risk
of 2.1 with a 95% confidence interval [1.4, 3.1]. If we consider an unmeasured confounder with
$(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) = (2, 2)$, then the joint bounding factor is $2 \times 2 / (2 + 2 - 1) = 1.33$, and the
corrected relative risk is 2.1/1.33 = 1.58 with a 95% confidence interval
[1.4/1.33, 3.1/1.33] = [1.05, 2.33]. Therefore, the confounder with $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) = (2, 2)$ cannot
explain away the observed relative risk 2.1 or its lower confidence limit 1.4, i.e., it cannot
reduce the point estimate and lower confidence limit of the relative risk to be smaller than one.
If we consider an unmeasured confounder with $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) = (2.5, 3.5)$, then the joint bounding
factor is $2.5 \times 3.5 / (2.5 + 3.5 - 1) = 1.75$, and an estimate for the lower bound of the true causal
relative risk is 2.1/1.75 = 1.20 with a 95% confidence interval [1.4/1.75, 3.1/1.75] = [0.8, 1.77].
Although the confounder with $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) = (2.5, 3.5)$ cannot explain away the observed
relative risk of 2.1, it reduces the original lower confidence limit 1.4 to 0.8 (i.e., less than
one). Note that we are not merely assessing a binary confounder, and we are not imposing
the no interaction assumption. Moreover, we are not restricted to only assessing how much
confounding can explain away an effect, nor are we even assuming that there is a single
unmeasured confounder (since U can be a vector of unmeasured confounders). The corrected
estimates and confidence intervals above are applicable irrespective of the underlying
confounder (or confounders). We can apply the technique to obtain a range of values for the true
causal effect under different specifications of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$.
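
The corrected relative risk described above amounts to one division. The sketch below is a minimal Python illustration, not the authors' software; the function names `bounding_factor` and `corrected_rr` are hypothetical. It applies the joint bounding factor to the worked example (observed relative risk 2.1, 95% confidence interval [1.4, 3.1]).

```python
from typing import Tuple


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    # Both parameters are maxima of ratios and are therefore at least 1.
    if rr_eu < 1 or rr_ud < 1:
        raise ValueError("RR_EU and RR_UD must both be at least 1")
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def corrected_rr(
    estimate: float, lower: float, upper: float, rr_eu: float, rr_ud: float
) -> Tuple[float, float, float]:
    """Divide the estimate and both confidence limits by the bounding factor."""
    bf = bounding_factor(rr_eu, rr_ud)
    return estimate / bf, lower / bf, upper / bf


if __name__ == "__main__":
    # The two confounding scenarios discussed in the text.
    for rr_eu, rr_ud in [(2.0, 2.0), (2.5, 3.5)]:
        est, lo, hi = corrected_rr(2.1, 1.4, 3.1, rr_eu, rr_ud)
        bf = bounding_factor(rr_eu, rr_ud)
        print(f"RR_EU={rr_eu}, RR_UD={rr_ud}: BF={bf:.2f}, "
              f"corrected RR={est:.2f} [{lo:.2f}, {hi:.2f}]")
```

Because the code divides by the unrounded bounding factor, the last digit can differ slightly from hand calculations that first round the factor to two decimals.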

Table 1 shows the magnitudes of the joint bounding factor for different combinations
of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$. The entries in the table for the joint bounding factor are the largest
observed relative risks that such an unmeasured confounder could explain away. We can see
from the table that the joint bounding factor is always smaller than both of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$,
and much smaller than the maximum of them.


Table 1: Magnitudes of the joint bounding factor for different combinations of $\mathrm{RR}_{EU}$ (rows)
and $\mathrm{RR}_{UD}$ (columns)

                                      RR_UD
RR_EU    1.3   1.5   1.8   2     2.5   3     3.5   4     5     6     8     10
1.3      1.06  1.08  1.11  1.13  1.16  1.18  1.20  1.21  1.23  1.24  1.25  1.26
1.5      1.08  1.12  1.17  1.20  1.25  1.29  1.31  1.33  1.36  1.38  1.41  1.43
1.8      1.11  1.17  1.25  1.29  1.36  1.42  1.47  1.50  1.55  1.59  1.64  1.67
2        1.13  1.20  1.29  1.33  1.43  1.50  1.56  1.60  1.67  1.71  1.78  1.82
2.5      1.16  1.25  1.36  1.43  1.56  1.67  1.75  1.82  1.92  2.00  2.11  2.17
3        1.18  1.29  1.42  1.50  1.67  1.80  1.91  2.00  2.14  2.25  2.40  2.50
3.5      1.20  1.31  1.47  1.56  1.75  1.91  2.04  2.15  2.33  2.47  2.67  2.80
4        1.21  1.33  1.50  1.60  1.82  2.00  2.15  2.29  2.50  2.67  2.91  3.08
5        1.23  1.36  1.55  1.67  1.92  2.14  2.33  2.50  2.78  3.00  3.33  3.57
6        1.24  1.38  1.59  1.71  2.00  2.25  2.47  2.67  3.00  3.27  3.69  4.00
8        1.25  1.41  1.64  1.78  2.11  2.40  2.67  2.91  3.33  3.69  4.27  4.71
10       1.26  1.43  1.67  1.82  2.17  2.50  2.80  3.08  3.57  4.00  4.71  5.26
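
The entries of Table 1 follow directly from the bounding factor formula. As a hedged illustration, the Python sketch below regenerates the grid; the names `LEVELS`, `bounding_factor_table`, and `format_table` are hypothetical, and the printed layout is only one possible choice.

```python
from typing import List, Sequence

# Row and column values used in Table 1.
LEVELS = [1.3, 1.5, 1.8, 2, 2.5, 3, 3.5, 4, 5, 6, 8, 10]


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def bounding_factor_table(levels: Sequence[float]) -> List[List[float]]:
    """Unrounded bounding factors; rows index RR_EU, columns index RR_UD."""
    return [[bounding_factor(r, c) for c in levels] for r in levels]


def format_table(levels: Sequence[float]) -> str:
    """Render the grid as plain text with two decimals per cell."""
    lines = [" " * 6 + "".join(f"{c:>6}" for c in levels)]
    for r, row in zip(levels, bounding_factor_table(levels)):
        lines.append(f"{r:>6}" + "".join(f"{v:>6.2f}" for v in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(LEVELS))
```

The grid is symmetric because the formula treats the two parameters interchangeably, and every cell is below both its row and column label, which matches the remark made about Table 1.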


As a second important consequence of our main result in (1), we also show in the
eAppendix that once we specify one of the unmeasured confounding measures, for example
$\mathrm{RR}_{EU}$, then to be able to reduce an observed relative risk of $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to a true causal
relative risk of $\mathrm{RR}^{\mathrm{true}}_{ED}$, the other confounding measure $\mathrm{RR}_{UD}$ must be at least of the magnitude

$$\mathrm{RR}_{UD} \ge \frac{\mathrm{RR}_{EU} \times \mathrm{RR}^{\mathrm{obs}}_{ED} - \mathrm{RR}^{\mathrm{obs}}_{ED}}{\mathrm{RR}_{EU} \times \mathrm{RR}^{\mathrm{true}}_{ED} - \mathrm{RR}^{\mathrm{obs}}_{ED}}.$$

For an unmeasured confounder to completely explain away the relative risk, i.e., reduce
$\mathrm{RR}^{\mathrm{obs}}_{ED}$ to $\mathrm{RR}^{\mathrm{true}}_{ED} = 1$, once we specify $\mathrm{RR}_{EU}$ the other unmeasured confounding measure
must be at least of the magnitude

$$\mathrm{RR}_{UD} \ge \frac{\mathrm{RR}_{EU} \times \mathrm{RR}^{\mathrm{obs}}_{ED} - \mathrm{RR}^{\mathrm{obs}}_{ED}}{\mathrm{RR}_{EU} - \mathrm{RR}^{\mathrm{obs}}_{ED}}.$$

For example, suppose we have an observed relative risk $\mathrm{RR}^{\mathrm{obs}}_{ED} = 2.5$ and we specify the
exposure-confounder association $\mathrm{RR}_{EU} = 3$. Then in order to reduce the observed relative risk to a true
causal relative risk $\mathrm{RR}^{\mathrm{true}}_{ED} = 1.5$, the confounder-outcome association must be at least as large
as $(3 \times 2.5 - 2.5)/(3 \times 1.5 - 2.5) = 2.5$; in order to completely explain away the observed
relative risk (i.e., to reduce the observed relative risk to $\mathrm{RR}^{\mathrm{true}}_{ED} = 1$), the confounder-outcome
association must be at least as large as $(3 \times 2.5 - 2.5)/(3 - 2.5) = 10$. The symmetry of result
(1) implies that a similar result also holds for $\mathrm{RR}_{EU}$ with pre-specified $\mathrm{RR}_{UD}$.
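
The threshold for one sensitivity parameter given the other can be sketched in Python as follows. The function name `min_rr_ud` is hypothetical, and returning infinity when the specified $\mathrm{RR}_{EU}$ does not exceed $\mathrm{RR}^{\mathrm{obs}}_{ED}/\mathrm{RR}^{\mathrm{true}}_{ED}$ is a convention adopted here: the bounding factor stays below $\mathrm{RR}_{EU}$ however large $\mathrm{RR}_{UD}$ becomes, so no finite value suffices in that case.

```python
import math


def min_rr_ud(rr_obs: float, rr_eu: float, rr_true: float = 1.0) -> float:
    """Smallest RR_UD that can reduce rr_obs to rr_true for a given RR_EU.

    Implements (RR_EU * RR_obs - RR_obs) / (RR_EU * RR_true - RR_obs).
    By the symmetry of the bounding factor, the same function gives the
    smallest RR_EU when its second argument is a pre-specified RR_UD.
    """
    denominator = rr_eu * rr_true - rr_obs
    if denominator <= 0:
        # RR_EU <= RR_obs / RR_true: no finite RR_UD is large enough.
        return math.inf
    return (rr_eu * rr_obs - rr_obs) / denominator


if __name__ == "__main__":
    # Worked example from the text: RR_obs = 2.5 and RR_EU = 3.
    print(min_rr_ud(2.5, 3.0, rr_true=1.5))
    print(min_rr_ud(2.5, 3.0))
```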

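Third, we show in the eAppendix that if both the generalized relative risks $\mathrm{RR}_{EU}$ and
$\mathrm{RR}_{UD}$ have the same magnitude, for an unmeasured confounder to reduce an observed
relative risk of $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to a true causal relative risk of $\mathrm{RR}^{\mathrm{true}}_{ED}$, both of the confounding relative
risks must thus be at least as large as

$$\mathrm{RR}_{EU} = \mathrm{RR}_{UD} \ge \left\{ \mathrm{RR}^{\mathrm{obs}}_{ED} + \sqrt{\mathrm{RR}^{\mathrm{obs}}_{ED} (\mathrm{RR}^{\mathrm{obs}}_{ED} - \mathrm{RR}^{\mathrm{true}}_{ED})} \right\} \Big/ \mathrm{RR}^{\mathrm{true}}_{ED}.$$

For an unmeasured confounder to completely explain away an observed relative risk of $\mathrm{RR}^{\mathrm{obs}}_{ED}$
(i.e., to reduce $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to a true causal relative risk of $\mathrm{RR}^{\mathrm{true}}_{ED} = 1$), both $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must
be at least as large as

$$\mathrm{RR}_{EU} = \mathrm{RR}_{UD} \ge \mathrm{RR}^{\mathrm{obs}}_{ED} + \sqrt{\mathrm{RR}^{\mathrm{obs}}_{ED} (\mathrm{RR}^{\mathrm{obs}}_{ED} - 1)}.$$

If one of the confounding relative risks is smaller than the lower bound above, we then know
that the other one must be larger. Thus even if $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ are not of the same
magnitude, the maximum of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must satisfy the inequality above. We then have the
following “high threshold” condition:

$$\max(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge \left\{ \mathrm{RR}^{\mathrm{obs}}_{ED} + \sqrt{\mathrm{RR}^{\mathrm{obs}}_{ED} (\mathrm{RR}^{\mathrm{obs}}_{ED} - \mathrm{RR}^{\mathrm{true}}_{ED})} \right\} \Big/ \mathrm{RR}^{\mathrm{true}}_{ED}.$$

For example, in order to reduce an observed relative risk of $\mathrm{RR}^{\mathrm{obs}}_{ED} = 2.5$ to a true causal
relative risk of $\mathrm{RR}^{\mathrm{true}}_{ED} = 1.5$, the high threshold is $(2.5 + \sqrt{2.5 \times 1})/1.5 = 2.72$; at least one
of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must be of magnitude 2.72 or above. In order to completely explain
away an observed relative risk of $\mathrm{RR}^{\mathrm{obs}}_{ED} = 2.5$, the high threshold is $2.5 + \sqrt{2.5 \times 1.5} = 4.44$;
at least one of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must be of magnitude 4.44 or higher to completely explain
away the effect.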
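
The high threshold is the value at which two equal sensitivity parameters produce a bounding factor of exactly $\mathrm{RR}^{\mathrm{obs}}_{ED}/\mathrm{RR}^{\mathrm{true}}_{ED}$. A short Python sketch, with the hypothetical function names `high_threshold` and `bounding_factor`, makes that connection explicit.

```python
import math


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def high_threshold(rr_obs: float, rr_true: float = 1.0) -> float:
    """Value that max(RR_EU, RR_UD) must reach to reduce rr_obs to rr_true."""
    if rr_true <= 0 or rr_obs < rr_true:
        raise ValueError("requires 0 < rr_true <= rr_obs")
    return (rr_obs + math.sqrt(rr_obs * (rr_obs - rr_true))) / rr_true


if __name__ == "__main__":
    for target in (1.5, 1.0):
        t = high_threshold(2.5, target)
        # With RR_EU = RR_UD = t the bounding factor equals rr_obs / rr_true.
        print(f"target RR={target}: threshold={t:.2f}, "
              f"BF at threshold={bounding_factor(t, t):.3f}")
```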

Fourth, the bias formula in (1) is relevant for an apparently causative exposure, which
allows researchers to get lower bounds of the true causal relative risk given pre-specified
sensitivity parameters $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$. If the exposure E is apparently preventive with
$\mathrm{RR}^{\mathrm{obs}}_{ED} < 1$, we can use the following formula to conduct sensitivity analysis:

$$\mathrm{RR}^{\mathrm{true}}_{ED} \le \mathrm{RR}^{\mathrm{obs}}_{ED} \times \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1}, \qquad (2)$$

where we modify the definition of $\mathrm{RR}_{EU}$ as $\max_k \mathrm{RR}_{EU,k}^{-1}$, i.e., the maximum of the inverse
relative risks relating E and U, or equivalently the inverse of the minimum of the relative risks
relating E and U. For an apparently preventive exposure, (2) allows researchers to obtain an
upper bound of the causal relative risk $\mathrm{RR}^{\mathrm{true}}_{ED}$ by multiplying the observed relative risk $\mathrm{RR}^{\mathrm{obs}}_{ED}$
by the joint bounding factor $\mathrm{RR}_{EU} \times \mathrm{RR}_{UD} / (\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$. We present the proof in
the eAppendix, and omit analogous discussion based on (2).
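
For an apparently preventive exposure the only changes are the direction of the bound and the inverted exposure-confounder parameter. The Python sketch below is illustrative; the function names and the numbers in the example are hypothetical and do not come from the paper.

```python
from typing import Sequence


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def inverse_rr_eu(p_u_e1: Sequence[float], p_u_e0: Sequence[float]) -> float:
    """Modified RR_EU for (2): max over k of P(U=k | E=0) / P(U=k | E=1)."""
    return max(b / a for a, b in zip(p_u_e1, p_u_e0))


def upper_bound_preventive(rr_obs: float, rr_eu_inv: float, rr_ud: float) -> float:
    """Upper bound on the true relative risk when rr_obs < 1, as in (2)."""
    return rr_obs * bounding_factor(rr_eu_inv, rr_ud)


if __name__ == "__main__":
    # Hypothetical inputs: observed RR of 0.5 and a binary confounder.
    rr_eu_inv = inverse_rr_eu(p_u_e1=[0.2, 0.8], p_u_e0=[0.5, 0.5])
    print(rr_eu_inv)
    print(upper_bound_preventive(0.5, rr_eu_inv, rr_ud=2.0))
```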

Finally, all the results above are within strata of the observed covariates C as would be
obtained from a log-binomial regression model or a logistic regression model with rare
outcome. If the averaged relative risk over the observed covariates C is of interest, the true causal
relative risk must be at least as large as the minimum of
$\mathrm{RR}^{\mathrm{obs}}_{ED|c} / \{\mathrm{RR}_{EU|c} \times \mathrm{RR}_{UD|c} / (\mathrm{RR}_{EU|c} + \mathrm{RR}_{UD|c} - 1)\}$ over c.

If we assume a common causal relative risk among the levels of C as in the usual log-linear
or logistic regression with rare outcomes, then the true causal relative risk must be at least
as large as the maximum of $\mathrm{RR}^{\mathrm{obs}}_{ED|c} / \{\mathrm{RR}_{EU|c} \times \mathrm{RR}_{UD|c} / (\mathrm{RR}_{EU|c} + \mathrm{RR}_{UD|c} - 1)\}$ over c.
See the eAppendix for further discussion.
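
The two stratified statements above can be sketched as a minimum and a maximum over stratum-specific lower bounds. In the Python illustration below the function names and the two strata are hypothetical; each stratum is supplied as a tuple of its observed relative risk and its two sensitivity parameters.

```python
from typing import List, Sequence, Tuple

# One stratum: (observed RR within c, RR_EU within c, RR_UD within c).
Stratum = Tuple[float, float, float]


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def stratum_bounds(strata: Sequence[Stratum]) -> List[float]:
    """Lower bound on the true relative risk within each stratum of C."""
    return [rr_obs / bounding_factor(rr_eu, rr_ud) for rr_obs, rr_eu, rr_ud in strata]


def averaged_lower_bound(strata: Sequence[Stratum]) -> float:
    """Bound for the relative risk averaged over C: minimum over strata."""
    return min(stratum_bounds(strata))


def common_effect_lower_bound(strata: Sequence[Stratum]) -> float:
    """Bound under a common relative risk across C: maximum over strata."""
    return max(stratum_bounds(strata))


if __name__ == "__main__":
    # Hypothetical strata, for illustration only.
    strata = [(2.0, 2.0, 2.0), (3.0, 2.0, 2.0)]
    print(stratum_bounds(strata))
    print(averaged_lower_bound(strata), common_effect_lower_bound(strata))
```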


3 Relation with Cornfield Conditions 

Under the assumptions of a binary confounder U and the conditional independence between
the exposure E and the outcome D given the confounder U, Cornfield et al. showed that the
exposure-confounder relative risk must be at least as large as the observed exposure-outcome
relative risk:

$$\mathrm{RR}_{EU} \ge \mathrm{RR}^{\mathrm{obs}}_{ED}. \qquad (3)$$

Schlesselman further showed that the confounder-outcome relative risk must also be at least
as large as the observed exposure-outcome relative risk:

$$\mathrm{RR}_{UD} \ge \mathrm{RR}^{\mathrm{obs}}_{ED}. \qquad (4)$$

We show in the eAppendix that the classical Cornfield conditions (3) and (4) are just special
cases of our result by letting one of $\mathrm{RR}_{EU}$ or $\mathrm{RR}_{UD}$ go to infinity in (1). Moreover, our
results apply to general confounders not just binary confounders, and our results also apply
to other possible values of the true causal relative risk of the exposure on the outcome. We
are not restricted to only assessing how strong the unmeasured confounder would have to be
to completely explain away the effect. Thus, for example, for confounding to reduce the
observed relative risk $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to a true causal relative risk of $\mathrm{RR}^{\mathrm{true}}_{ED}$, the unmeasured confounding
measures have to satisfy

$$\mathrm{RR}_{EU} \ge \mathrm{RR}^{\mathrm{obs}}_{ED} / \mathrm{RR}^{\mathrm{true}}_{ED} \quad \text{and} \quad \mathrm{RR}_{UD} \ge \mathrm{RR}^{\mathrm{obs}}_{ED} / \mathrm{RR}^{\mathrm{true}}_{ED}. \qquad (5)$$

Perhaps even more importantly with regard to Cornfield-like conditions, our main result in
(1) not only leads to the conditions in (5) that both $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must satisfy, but also
implies the following condition that the maximum of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must satisfy:

$$\max(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge \left\{ \mathrm{RR}^{\mathrm{obs}}_{ED} + \sqrt{\mathrm{RR}^{\mathrm{obs}}_{ED} (\mathrm{RR}^{\mathrm{obs}}_{ED} - \mathrm{RR}^{\mathrm{true}}_{ED})} \right\} \Big/ \mathrm{RR}^{\mathrm{true}}_{ED} \qquad (6)$$

to reduce an observed relative risk $\mathrm{RR}^{\mathrm{obs}}_{ED}$ to a true causal relative risk $\mathrm{RR}^{\mathrm{true}}_{ED}$. We show this
in the eAppendix. As a special case, for the unmeasured confounder to completely explain
away the observed relative risk (i.e., $\mathrm{RR}^{\mathrm{true}}_{ED} = 1$), it is necessary that

$$\max(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge \mathrm{RR}^{\mathrm{obs}}_{ED} + \sqrt{\mathrm{RR}^{\mathrm{obs}}_{ED} (\mathrm{RR}^{\mathrm{obs}}_{ED} - 1)}.$$

Once again the results do not require a binary unmeasured confounder. They are applicable
to any unmeasured confounder. Similar low and high threshold Cornfield conditions that the
minimum and maximum of the confounding measures must satisfy to completely explain
away an effect were derived on an odds ratio scale of exposure-confounder association by
Flanders and Khoury and Lee, and we comment on and extend these results in the
eAppendix.

The classical Cornfield conditions and the high threshold generalization are useful to
answer the question about the magnitude of the association between the exposure and the
confounder and that between the confounder and the outcome, in order to explain away the
observed exposure-outcome association or, with our new results, to reduce it to a pre-specified
magnitude. The Cornfield conditions in (5) and (6) are especially useful when we want to
specify only one of the marginal associations $\mathrm{RR}_{EU}$ or $\mathrm{RR}_{UD}$ as well as their relative
magnitudes. However, they are inferior to the main result in (1), which is essentially the condition
that the joint values of $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$ must satisfy. As will be seen below, although the
high threshold conclusions are a useful heuristic, they are weaker than the use of our new
joint bounding factor in (1) insofar as there are scenarios in which the joint bounding factor in
(1) can rule out an estimate as being due to unmeasured confounding but the high threshold
conditions cannot. For example, when we have an observed exposure-outcome relative risk
of $\mathrm{RR}^{\mathrm{obs}}_{ED} = 3$, the low threshold (i.e., the classical Cornfield condition) is given by

$$\min(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge 3,$$

the high threshold is given by

$$\max(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge 3 + \sqrt{3 \times 2} = 5.45,$$

and the joint threshold condition is given by

$$\frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1} \ge 3.$$

Thus, the low Cornfield threshold is 3, and so we know that both $\mathrm{RR}_{EU}$
and $\mathrm{RR}_{UD}$ must be at least 3 to explain away the effect. The high Cornfield threshold is 5.45,
and so at least one of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must be at least 5.45 to explain away the effect.
Consider an unmeasured confounder with $(\mathrm{RR}_{EU} = 5.5, \mathrm{RR}_{UD} = 3.1)$. These values exceed
both the low Cornfield threshold (since $\mathrm{RR}_{EU} > 3$, $\mathrm{RR}_{UD} > 3$) and the high threshold (since
$\mathrm{RR}_{EU} > 5.45$), and we might thus think it can explain away the observed exposure-outcome
relative risk. However, using our joint threshold condition in (1), an unmeasured confounder
with $(\mathrm{RR}_{EU} = 5.5, \mathrm{RR}_{UD} = 3.1)$ has a bounding factor $5.5 \times 3.1 / (5.5 + 3.1 - 1) = 2.24 < 3$
and thus such confounding could not explain away an observed relative risk of 3. We can
see this from our result in (1), but we cannot see this from the classical Cornfield conditions
or even the new high threshold Cornfield condition. The Cornfield conditions, both low
and high thresholds, although a useful heuristic, are not as useful for sensitivity analysis as
our bounding factor in (1) insofar as there are scenarios, such as the one above, in which our
bounding factor in (1) can rule out an estimate as being due to unmeasured confounding but
the low and high threshold Cornfield conditions cannot.
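
The comparison in this section can be reproduced with a few lines of Python. The sketch below is illustrative and the function name `explain_away_checks` is hypothetical; it evaluates the low threshold, the high threshold, and the joint bounding factor condition for one confounding scenario.

```python
import math
from typing import Dict


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def explain_away_checks(rr_obs: float, rr_eu: float, rr_ud: float) -> Dict[str, bool]:
    """Necessary conditions for a confounder to fully explain away rr_obs."""
    high = rr_obs + math.sqrt(rr_obs * (rr_obs - 1))
    return {
        # Classical Cornfield condition: both parameters reach rr_obs.
        "low_threshold": min(rr_eu, rr_ud) >= rr_obs,
        # High threshold: the larger parameter reaches the bound in (6).
        "high_threshold": max(rr_eu, rr_ud) >= high,
        # Joint condition from (1): the bounding factor reaches rr_obs.
        "joint_condition": bounding_factor(rr_eu, rr_ud) >= rr_obs,
    }


if __name__ == "__main__":
    # Scenario from the text: RR_obs = 3, RR_EU = 5.5, RR_UD = 3.1.
    print(explain_away_checks(3.0, 5.5, 3.1))
```

Only the joint condition fails for this scenario, which is the sense in which the bounding factor is sharper than either threshold alone.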

4 Illustration 

Consider the historical study conducted by Hammond and Horn, in which the point
estimate of the observed relative risk of cigarette smoking on lung cancer was $\mathrm{RR}^{\mathrm{obs}}_{ED} = 10.73$
with 95% confidence interval [8.02, 14.36]. Fisher suggested that the observed relative risk
of the exposure E on the outcome D might be completely due to the existence of a common
genetic confounder. The work of Cornfield et al. showed that for a binary unmeasured
confounder to completely explain away the observed relative risk, both the exposure-confounder
relative risk and the confounder-outcome relative risk would have to be at least 10.73. Let us
now assume then that both the exposure-confounder relative risk and the confounder-outcome
relative risk have the magnitude 10.73. The joint bounding factor is

$$\frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1} = \frac{10.73 \times 10.73}{10.73 + 10.73 - 1} = 5.63.$$

Even if we assume such a strong confounder, the point estimate of the causal relative risk of
cigarette smoking and lung cancer must still be at least as large as
$\mathrm{RR}^{\mathrm{true}}_{ED} \ge \mathrm{RR}^{\mathrm{obs}}_{ED} / 5.63 = 10.73/5.63 = 1.91 > 1$, and the 95% confidence interval is
[8.02/5.63, 14.36/5.63] = [1.42, 2.55] with the lower confidence limit still larger than one. Thus
in fact, not even exposure-confounder and confounder-outcome relative risks of 10.73 suffice to
explain away the effect or the lower confidence limit. In fact, in order to explain away the point
estimate of the observed relative risk 10.73, the magnitude of $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$
(if $\mathrm{RR}_{EU} = \mathrm{RR}_{UD}$) should be at least as large as $10.73 + \sqrt{10.73 \times 9.73} = 20.95$. And in order
to explain away the lower confidence limit 8.02, these two confounding relative risks should be
at least as large as $8.02 + \sqrt{8.02 \times 7.02} = 15.52$. More generally, we can plot those values of
$\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ that would be required to explain away the effect estimate or the lower limit
of the confidence interval. This is given in Figure 1. To explain away the point estimate the two
parameters would have to lie on or above the solid line. To explain away the lower confidence
limit the two parameters would have to lie on or above the dotted line. These results hold without
any assumptions on the structure of the unmeasured confounding. The numerical results above
show that, using the new joint bounding factor, it is even more implausible than under
the Cornfield conditions that a genetic confounder explains away the relative risk between
cigarette smoking and lung cancer.
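
The boundary curves of the kind shown in Figure 1 follow from setting the bounding factor equal to the value to be explained away and solving for $\mathrm{RR}_{UD}$. The Python sketch below is an illustration under that reading, with hypothetical function names and an arbitrary choice of grid points; it prints a few points on each curve rather than drawing the figure.

```python
import math


def boundary_rr_ud(rr_obs: float, rr_eu: float) -> float:
    """Smallest RR_UD for which the bounding factor reaches rr_obs."""
    if rr_eu <= rr_obs:
        # The bounding factor never reaches RR_EU, so no RR_UD suffices.
        return math.inf
    return rr_obs * (rr_eu - 1) / (rr_eu - rr_obs)


def symmetric_threshold(rr_obs: float) -> float:
    """Point where the boundary curve crosses the line RR_EU = RR_UD."""
    return rr_obs + math.sqrt(rr_obs * (rr_obs - 1))


if __name__ == "__main__":
    # Point estimate and lower confidence limit from Hammond and Horn.
    for label, rr in [("point estimate", 10.73), ("lower limit", 8.02)]:
        print(f"{label} {rr}: symmetric threshold {symmetric_threshold(rr):.2f}")
        for rr_eu in (12, 15, 20, 30, 50):  # arbitrary grid for illustration
            print(f"  RR_EU={rr_eu}: RR_UD must be at least "
                  f"{boundary_rr_ud(rr, rr_eu):.2f}")
```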

More generally, we could consider corrected estimates and confidence intervals for the
effect over a range of different values of the sensitivity analysis parameters, $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$,
as in Table 2. The columns of Table 2 correspond to $\mathrm{RR}_{UD}$ and the rows to $\mathrm{RR}_{EU}$. The
entries are the corrected estimates and confidence intervals for the effect under the different 
confounding scenarios. In general a table like this one is most informative for sensitivity 
analysis. SAS code to carry out such a sensitivity analysis and to provide such a table is 
given in Appendix 1. 
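
A table of corrected estimates of this kind can also be sketched in Python. The code below is not the SAS code of Appendix 1; the function name `corrected_table` and the grid of parameter values are assumptions made for illustration, while the estimate and confidence limits are those of Hammond and Horn quoted above.

```python
from typing import Dict, Sequence, Tuple

Cell = Tuple[float, float, float]  # corrected estimate, lower limit, upper limit


def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def corrected_table(
    estimate: float,
    lower: float,
    upper: float,
    rr_eu_levels: Sequence[float],
    rr_ud_levels: Sequence[float],
) -> Dict[Tuple[float, float], Cell]:
    """Corrected estimate and limits for every (RR_EU, RR_UD) combination."""
    table = {}
    for rr_eu in rr_eu_levels:
        for rr_ud in rr_ud_levels:
            bf = bounding_factor(rr_eu, rr_ud)
            table[(rr_eu, rr_ud)] = (estimate / bf, lower / bf, upper / bf)
    return table


if __name__ == "__main__":
    levels = [1.2, 1.5, 2, 3, 5, 8, 10]  # assumed grid, for illustration
    table = corrected_table(10.73, 8.02, 14.36, levels, levels)
    for rr_eu in levels:
        cells = (table[(rr_eu, rr_ud)] for rr_ud in levels)
        print(f"{rr_eu:>4}  " + "  ".join(
            f"{est:5.2f} ({lo:5.2f}, {hi:5.2f})" for est, lo, hi in cells))
```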

Figure 1: The areas above the two lines are the joint values of $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$ that would
be required to explain away the effect estimate 10.73 and the lower confidence limit 8.02.




Table 2: Bounds on corrected estimates, lower confidence limits, and upper confidence limits 
for unmeasured confounding (each cell contains bounds on point estimate, lower and upper 
confidence limits; columns correspond to increasing strength of the risk ratio of U on the 
outcome; rows correspond to increasing strength of risk ratio relating the exposure and U) 

(2.89, 5.17)   3.22 (2.41, 4.31)   3.00 (2.25, 4.02)
8.0   9.17 (6.85, 12.27)   8.56 (6.40, 11.46)   7.60 (5.68, 10.17)   6.56
(2.41,
(4.81, 8.62)   5.90 (4.41, 7.90)   4.94 (3.69, 6.61)   4.29 (3.21, 5.74)   3.00 (2.25, 4.02)   2.28 (1.70, 3.05)   2.04 (1.52, 2.73)

5 Discussion


A crucial task in causal inference with observational studies is to assess the sensitivity of 
causal conclusions with respect to unmeasured confounding. In sensitivity analysis, because 
one is assessing the sensitivity of conclusions to the assumption of no unmeasured confounding,
additional untestable assumptions may often seem undesirable and suspect to researchers.
We have introduced a new joint bounding factor that allows researchers to conduct sensitivity
analysis without assumptions, i.e., we provide an inequality, that is applicable without
any assumptions, such that the sensitivity analysis parameters must satisfy the inequality if 
an unmeasured confounder is to explain away the observed effect estimate or reduce it to a 
particular level. We can obtain a conservative estimate of the true causal effect by dividing 
the observed relative risk by the bounding factor; the method does not assume a single binary 
confounder or no exposure-confounder interaction on the outcome. 

Previous sensitivity analysis approaches in the literature often relied on the assumption
of a single binary confounder and no interaction between the effects of the exposure and the
confounder on the outcome. For example, Schlesselman assumed a binary confounder and
a common relative risk, $\gamma$, of the confounder on the outcome both with and without
exposure, i.e., a no interaction assumption. Under these assumptions, he obtained the bias factor
$\mathrm{RR}^{\mathrm{obs}}_{ED} / \mathrm{RR}^{\mathrm{true}}_{ED} = \{1 + (\gamma - 1) P(U = 1 \mid E = 1)\} / \{1 + (\gamma - 1) P(U = 1 \mid E = 0)\}$ for
sensitivity analysis, requiring specifications of $\gamma$, $P(U = 1 \mid E = 1)$ and $P(U = 1 \mid E = 0)$. Our
result requires fewer assumptions and fewer sensitivity parameters (two rather than three).
We further discuss in the eAppendix that, under Schlesselman’s formula, if
$P(U = 1 \mid E = 1) / P(U = 1 \mid E = 0)$ is constrained to be no larger than some limit $\mathrm{RR}_{EU}$, then the maximum
bias factor that can be obtained from Schlesselman’s formula is $\mathrm{RR}_{EU} \times \gamma / (\mathrm{RR}_{EU} + \gamma - 1)$,
which is the same as our bounding factor. Thus, in this setting Schlesselman’s no interaction
assumption does not strengthen the bounds; the no interaction assumption is unnecessary.
Without the no interaction assumption, Flanders and Khoury and VanderWeele and Arah
derived general formulas for sensitivity analysis. However, unless the confounder is binary,
these formulas require specifying a very large number of parameters. They also require
specifying the prevalence of each confounder level. Flanders and Khoury derive bounds for the
true causal relative risk for the exposed population which are potentially applicable without
specifying the prevalence of the unmeasured confounder. However, without specifying the
prevalence, their formula only leads to a low threshold Cornfield condition, and these bounds
are thus much weaker than those in this paper. We discuss further the relation between their
results and ours in the eAppendix.

The relative risk scale is widely used for sensitivity analysis in epidemiology and elsewhere,
but the risk difference scale is also often of interest and importance. We show,
in the Appendix, that similar conditions for sensitivity analysis also hold for the risk
difference. If we use similar sensitivity parameters on the relative risk scale for the risk difference
estimate, then we can derive similar lower bounds on the effects and determine how much
confounding is required to explain away an effect or reduce it to a specific level. See
Appendix 2 for details. SAS code for this approach is also given in the eAppendix. We can
also do sensitivity analysis for the risk difference using sensitivity parameters on the risk
difference scale. Unfortunately, however, these conditions for the risk difference using risk
difference sensitivity parameters then depend on the number of categories of the unmeasured
confounder, and become weaker for confounders with more categories. This is not the case
for sensitivity analysis of the risk difference (or the relative risk) if the sensitivity
parameters themselves are expressed on the relative risk scale, in which case the bounding factor is
applicable and is the same regardless of the number of categories. Due to this property, it is
perhaps more suitable to conduct sensitivity analysis for the risk difference using sensitivity
parameters on the relative risk scale. See Appendix 3 for further discussion.

The hazard ratio is widely used for analyzing data with time-to-event outcomes. In the
eAppendix, we show that under the assumption of having a rare outcome at the end of
follow-up, the same bounding factor also applies to the hazard ratio with the confounder-outcome
relative risk replaced by the confounder-outcome hazard ratio. Likewise, similar results also
apply to non-negative outcomes (e.g., counts or positive continuous outcomes) by replacing
the confounder-outcome relative risk by the maximum ratio by which the confounder may
increase the expected outcome comparing any two confounder categories.

The new joint bounding factor $(\mathrm{RR}_{EU} \times \mathrm{RR}_{UD})/(\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$ plays a central role
in our sensitivity analysis approach, which, in turn, gives us a new measure of the strength
of unmeasured confounding induced by a confounder U. Our approach has the advantage of
making no assumptions about the structure of the unmeasured confounder or confounders,
and of delivering conclusions much stronger than the original Cornfield conditions.
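As a minimal sketch of how the joint bounding factor is used, the following Python snippet divides an observed relative risk by the bounding factor. The function names are illustrative and not part of the paper; the numbers are the smoking and lung cancer estimate from the SAS appendix with one arbitrary pair of sensitivity parameters.

```python
def bounding_factor(rr_eu: float, rr_ud: float) -> float:
    """Joint bounding factor RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    # Assumption: both sensitivity parameters are maximal relative risks, hence >= 1.
    if rr_eu < 1 or rr_ud < 1:
        raise ValueError("RR_EU and RR_UD are assumed to be at least 1")
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def conservative_rr(rr_obs: float, rr_eu: float, rr_ud: float) -> float:
    """Lower bound on the true relative risk: observed RR divided by the bounding factor."""
    return rr_obs / bounding_factor(rr_eu, rr_ud)


# Observed RR = 10.73 with (RR_EU, RR_UD) = (10, 2).
print(round(conservative_rr(10.73, 10, 2), 2))
```

Because the bounding factor is symmetric in its two arguments, swapping RR_EU and RR_UD gives the same corrected estimate.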

In general, a table with many different possible sensitivity analysis parameters, including
values that are quite extreme, such as Table 2 above, will be most informative. However, at
the very least, in any observational study, researchers should report how much confounding
would be needed to reduce the estimate, and how much confounding would be needed to
reduce the confidence interval, to include the null. We believe that if this were always done
in observational studies, the evidence for causality could much more easily be assessed and
science would be better served.
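One way to summarize "how much confounding would be needed" is the common value of the two sensitivity parameters at which the bounding factor equals the estimate (or the confidence limit closest to the null). The sketch below assumes an apparently causative estimate (RR at least 1); the function name is hypothetical, and the inputs are the smoking estimate and its lower confidence limit from the SAS appendix.

```python
import math


def joint_threshold(rr: float) -> float:
    """Value x with x * x / (2 * x - 1) == rr, i.e. RR_EU = RR_UD = x explains away rr."""
    # Assumption: rr >= 1 (apparently causative exposure).
    return rr + math.sqrt(rr * (rr - 1))


rr, rr_lower = 10.73, 8.02
print(round(joint_threshold(rr), 2))        # strength needed to move the estimate to the null
print(round(joint_threshold(rr_lower), 2))  # strength needed to move the lower limit to the null
```

If the two parameters are allowed to differ, the larger of them must still reach this value, while the smaller must be at least as large as the estimate itself.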


Appendix 1: SAS Code 


The SAS code for the cigarette smoking and lung cancer example in Table 2 is given below.
A researcher could modify the code for use in other examples by just changing the first
few lines of code with the estimated observed relative risk controlling for only the measured
covariates (RR=), and the lower and upper confidence limits for this estimate (RR_Lower=,
RR_Upper=). The minimum and maximum strength of the unmeasured confounder can also
be modified by adjusting the lines with “RR_EU=” and “RR_UD=”, but we recommend always
including at least some relatively large values, e.g., with $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ at least as high as
5, so as to get a sense as to how an estimate would change under fairly severe confounding.


proc iml;
/* the point estimator and confidence interval of RR */
RR = 10.73;
RR_Lower = 8.02;
RR_Upper = 14.36;
/* strength of confounding resulting from U */
RR_EU = {1.2 1.3 1.5 1.8 2 2.5 3 4 5 6 8 10};
RR_UD = {1.2 1.3 1.5 1.8 2 2.5 3 4 5 6 8 10};
/* high threshold that max(RR_EU, RR_UD) must reach to explain away RR */
highthreshold = ROUND(RR + SQRT(RR*(RR-1)), 0.01);
rownames_EU = CHAR(RR_EU, NCOL(RR_EU), 1);
colnames_UD = CHAR(RR_UD, NCOL(RR_UD), 1);
/* matrices holding the bias factor and the pieces of the printed cells */
BiasFactor = J(NCOL(RR_EU), NCOL(RR_UD), 1);
SPACE = J(NCOL(RR_EU), NCOL(RR_UD), " ");
LeftP = J(NCOL(RR_EU), NCOL(RR_UD), "(");
Mid = J(NCOL(RR_EU), NCOL(RR_UD), ",");
RightP = J(NCOL(RR_EU), NCOL(RR_UD), ")");
RR_true = BiasFactor;
RR_true_Lower = BiasFactor;
RR_true_Upper = BiasFactor;
RR_true_CI = BiasFactor;
/* divide the estimate and its confidence limits by the bounding factor */
DO i=1 TO NCOL(RR_EU);
DO j=1 TO NCOL(RR_UD);
BiasFactor[i, j] = RR_EU[i]*RR_UD[j]/(RR_EU[i] + RR_UD[j] - 1);
RR_true[i, j] = ROUND(RR/BiasFactor[i, j], 0.01);
RR_true_Lower[i, j] = ROUND(RR_Lower/BiasFactor[i, j], 0.01);
RR_true_Upper[i, j] = ROUND(RR_Upper/BiasFactor[i, j], 0.01);
END;
END;
RR_true_CI = CATX(" ", CHAR(RR_true), LeftP, CHAR(RR_true_Lower), Mid, CHAR(RR_true_Upper), RightP);
print RR_true_CI[colname = colnames_UD
rowname = rownames_EU
label = "Bounds on corrected estimates and confidence intervals for unmeasured confounding
(columns correspond to increasing strength of the risk ratio of U on the outcome;
rows correspond to increasing strength of risk ratio relating the exposure and U)"];
run;
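For readers without access to SAS, the same grid can be sketched in Python. This is an illustrative port under the same inputs as the listing above, not the authors' code; the names `STRENGTHS`, `corrected_cell` and `build_table` are assumptions introduced here.

```python
RR, RR_LOWER, RR_UPPER = 10.73, 8.02, 14.36
STRENGTHS = [1.2, 1.3, 1.5, 1.8, 2, 2.5, 3, 4, 5, 6, 8, 10]  # used for both RR_EU and RR_UD


def bounding_factor(rr_eu, rr_ud):
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def corrected_cell(rr_eu, rr_ud):
    """Estimate and confidence limits divided by the bounding factor, rounded to 2 decimals."""
    bf = bounding_factor(rr_eu, rr_ud)
    return tuple(round(value / bf, 2) for value in (RR, RR_LOWER, RR_UPPER))


def build_table():
    """Map (RR_EU, RR_UD) to (corrected estimate, corrected lower, corrected upper)."""
    return {(eu, ud): corrected_cell(eu, ud) for eu in STRENGTHS for ud in STRENGTHS}


table = build_table()
for eu in STRENGTHS:  # rows: exposure-confounder strength; columns: confounder-outcome strength
    cells = ["{:.2f} ({:.2f}, {:.2f})".format(*table[(eu, ud)]) for ud in STRENGTHS]
    print(f"RR_EU={eu:<4}", " | ".join(cells))
```

The entries for (RR_EU, RR_UD) = (8, 1.2) and (10, 10) agree with the corresponding cells of Table 2 that survive in the text.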
Appendix 2: Conditions for the Risk Difference Using Sensitivity Parameters on the Relative Risk Scale

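As in the text, we assume analysis is conducted conditional on, or within strata of, the measured
covariates C. Define the bounding factor as $\mathrm{BF}_U = \mathrm{RR}_{EU} \times \mathrm{RR}_{UD}/(\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$,
the prevalence of the exposure as $f = P(E = 1)$, and the probabilities of the outcome with
and without exposure as $p_1 = P(D = 1 \mid E = 1)$ and $p_0 = P(D = 1 \mid E = 0)$. The causal risk
differences for the exposed and unexposed populations are
\[
\mathrm{RD}_{ED+}^{\mathrm{true}} = p_1 - \sum_{k=0}^{K-1} P(D = 1 \mid E = 0, U = k) P(U = k \mid E = 1),
\]
\[
\mathrm{RD}_{ED-}^{\mathrm{true}} = \sum_{k=0}^{K-1} P(D = 1 \mid E = 1, U = k) P(U = k \mid E = 0) - p_0,
\]
and the causal risk difference for the whole population is
\[
\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_{k=0}^{K-1} \{P(D = 1 \mid E = 1, U = k) - P(D = 1 \mid E = 0, U = k)\} P(U = k)
= f \, \mathrm{RD}_{ED+}^{\mathrm{true}} + (1 - f) \, \mathrm{RD}_{ED-}^{\mathrm{true}}.
\]

We show in the eAppendix that the lower bounds for the causal risk differences are
\[
\mathrm{RD}_{ED+}^{\mathrm{true}} \ge p_1 - p_0 \times \mathrm{BF}_U,
\]
\[
\mathrm{RD}_{ED-}^{\mathrm{true}} \ge p_1/\mathrm{BF}_U - p_0,
\]
\[
\mathrm{RD}_{ED}^{\mathrm{true}} \ge (p_1 - p_0 \times \mathrm{BF}_U) \times \{f + (1 - f)/\mathrm{BF}_U\}
= (p_1/\mathrm{BF}_U - p_0) \times \{f \times \mathrm{BF}_U + (1 - f)\}.
\]

Note that even without knowing $f$, we can use the inequality
$\mathrm{RD}_{ED}^{\mathrm{true}} \ge \min(\mathrm{RD}_{ED+}^{\mathrm{true}}, \mathrm{RD}_{ED-}^{\mathrm{true}})$
to obtain a lower bound for $\mathrm{RD}_{ED}^{\mathrm{true}}$.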

As an example, suppose the probabilities of the outcome with and without exposure are
$p_1 = 0.25$, $p_0 = 0.1$, and therefore the observed risk difference is $\mathrm{RD}_{ED}^{\mathrm{obs}} = p_1 - p_0 = 0.15$. If
we assume that the unmeasured confounding measures are $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) = (2, 2)$ with the
joint bounding factor of $2 \times 2/(2 + 2 - 1) = 1.33$, then the true risk difference for the exposed
is at least as large as $0.25 - 0.1 \times 1.33 = 0.12$, the true risk difference for the unexposed is at
least as large as $0.25/1.33 - 0.1 = 0.09$, and the true risk difference for the whole population
is at least as large as $\min(0.12, 0.09) = 0.09$. If we further know that the prevalence of the
exposure is $f = 0.2$, the true risk difference for the whole population is at least as large as
$0.12 \times 0.2 + 0.09 \times 0.8 = 0.10$.
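The arithmetic of this example can be reproduced with a few lines of Python. This is a sketch with hypothetical function and argument names; it implements the three lower bounds stated above and falls back to the minimum of the exposed and unexposed bounds when the exposure prevalence is not supplied.

```python
def rd_lower_bounds(p1, p0, bf, f=None):
    """Lower bounds on the causal risk differences for the exposed, unexposed and whole population.

    p1, p0: outcome probabilities with and without exposure; bf: joint bounding factor;
    f: exposure prevalence P(E = 1), or None if unknown.
    """
    exposed = p1 - p0 * bf
    unexposed = p1 / bf - p0
    if f is None:
        overall = min(exposed, unexposed)
    else:
        # Equals (p1 - p0 * bf) * (f + (1 - f) / bf).
        overall = f * exposed + (1 - f) * unexposed
    return exposed, unexposed, overall


bf = 2 * 2 / (2 + 2 - 1)
print([round(v, 3) for v in rd_lower_bounds(0.25, 0.1, bf)])
print([round(v, 3) for v in rd_lower_bounds(0.25, 0.1, bf, f=0.2)])
```

With unrounded intermediate values the prevalence-weighted bound is about 0.093, slightly below the 0.10 obtained when the rounded figures 0.12 and 0.09 are combined.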

The above results imply that, for an unmeasured confounder to reduce the observed risk
difference to $\mathrm{RD}_{ED+}^{\mathrm{true}}$, $\mathrm{RD}_{ED-}^{\mathrm{true}}$ and $\mathrm{RD}_{ED}^{\mathrm{true}}$, respectively, the Cornfield conditions for the
joint bounding factor for the exposed, the unexposed, and the whole population, respectively,
are
\[
\mathrm{BF}_U \ge (p_1 - \mathrm{RD}_{ED+}^{\mathrm{true}})/p_0,
\]
\[
\mathrm{BF}_U \ge p_1/(p_0 + \mathrm{RD}_{ED-}^{\mathrm{true}}),
\]
\[
\mathrm{BF}_U \ge \frac{\sqrt{\{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}^2 + 4 p_1 p_0 f (1 - f)} - \{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}}{2 p_0 f}.
\]

Note that if the true causal risk difference is $\mathrm{RD}^{\mathrm{true}} = 0$, the above conditions all reduce to
$\mathrm{BF}_U \ge \mathrm{RR}_{ED}^{\mathrm{obs}}$. Suppose, again, the probabilities of the observed outcome with and without
exposure are $p_1 = 0.25$, $p_0 = 0.1$, and the prevalence of the exposure is $f = 0.2$. For an
unmeasured confounder U to reduce the observed risk difference of $\mathrm{RD}_{ED}^{\mathrm{obs}} = 0.15$ to a true
risk difference of $\mathrm{RD}_{ED}^{\mathrm{true}} = 0.05$, the joint bounding factor resulting from the confounder must
be at least as large as
\[
\frac{\sqrt{(0.05 + 0.1 \times 0.8 - 0.25 \times 0.2)^2 + 4 \times 0.25 \times 0.1 \times 0.2 \times 0.8} - (0.05 + 0.1 \times 0.8 - 0.25 \times 0.2)}{2 \times 0.1 \times 0.2} = 1.74.
\]

Therefore, as in the text, both of the confounding measures $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$ must be at least
as large as 1.74, and the maximum of them must be at least as large as
$1.74 + \sqrt{1.74 \times 0.74} = 2.88$.
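The required bounding factor and the implied thresholds can be computed as in the following sketch. The function names are hypothetical; the whole-population formula is the positive root of the quadratic obtained by setting the lower bound $(p_1 - p_0 \mathrm{BF}_U)\{f + (1 - f)/\mathrm{BF}_U\}$ equal to the target risk difference.

```python
import math


def bf_needed_exposed(p1, p0, rd_true):
    return (p1 - rd_true) / p0


def bf_needed_unexposed(p1, p0, rd_true):
    return p1 / (p0 + rd_true)


def bf_needed_overall(p1, p0, f, rd_true):
    """Bounding factor at which the whole-population lower bound equals rd_true."""
    b = rd_true + p0 * (1 - f) - p1 * f
    return (math.sqrt(b * b + 4 * p1 * p0 * f * (1 - f)) - b) / (2 * p0 * f)


bf = bf_needed_overall(0.25, 0.1, 0.2, 0.05)
low = bf                                # both RR_EU and RR_UD must be at least this large
high = bf + math.sqrt(bf * (bf - 1))    # the larger of the two must be at least this large
print(round(low, 2), round(high, 2))
```

Setting `rd_true` to 0 returns the observed relative risk $p_1/p_0 = 2.5$, in line with the note above.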

The above results are useful for apparently causative exposures with $\mathrm{RD}_{ED}^{\mathrm{obs}} > 0$, which
give (possibly positive) lower bounds for the causal risk differences. However, for apparently
preventive exposure with $\mathrm{RD}_{ED}^{\mathrm{obs}} < 0$, we need to modify the definition of $\mathrm{RR}_{EU}$ as
$\mathrm{RR}_{EU} = \max_u \mathrm{RR}_{EU}^{-1}(u)$, and we have the following analogous results on the upper bounds of the
causal risk differences:
\[
\mathrm{RD}_{ED-}^{\mathrm{true}} \le p_1 \times \mathrm{BF}_U - p_0,
\]
\[
\mathrm{RD}_{ED+}^{\mathrm{true}} \le p_1 - p_0/\mathrm{BF}_U,
\]
\[
\mathrm{RD}_{ED}^{\mathrm{true}} \le (p_1 \times \mathrm{BF}_U - p_0) \times \{f + (1 - f)/\mathrm{BF}_U\}
= (p_1 - p_0/\mathrm{BF}_U) \times \{f \times \mathrm{BF}_U + (1 - f)\}.
\]

Due to the linearity of the risk difference, we can also obtain the lower bound of the
marginal risk differences averaged over the observed covariates C using
$\mathrm{RD}_{ED+}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED+|c}^{\mathrm{true}} P(C = c \mid E = 1)$,
$\mathrm{RD}_{ED-}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED-|c}^{\mathrm{true}} P(C = c \mid E = 0)$ and
$\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED|c}^{\mathrm{true}} P(C = c)$. In the
eAppendix, we provide details and proofs for the results above, discuss statistical inference
for the causal risk difference bounds under finite samples, and give formulas for how large
the bounding factor would have to be to reduce an estimate or a confidence interval to 0 or to
some other specified quantity. In the eAppendix, we also provide software code to implement
this sensitivity analysis approach for the risk difference.

Appendix 3: Conditions for the Risk Difference Using Sensitivity Parameters on the Risk Difference Scale

In the previous Appendix, we considered sensitivity analysis for the risk difference with
sensitivity analysis parameters on the relative risk scale. In this Appendix, we consider
sensitivity analysis for the risk difference with parameters defined on the risk difference scale.
Unfortunately, for the reasons described below, the results for the risk difference with
parameters defined on the difference scale are not as practically useful as when the parameters are
defined on the relative risk scale.

Let $\mathrm{RD}_{ED}^{\mathrm{obs}} = P(D = 1 \mid E = 1) - P(D = 1 \mid E = 0)$ denote the observed risk difference,
and
\[
\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_{k=0}^{K-1} \{P(D = 1 \mid E = 1, U = k) - P(D = 1 \mid E = 0, U = k)\} P(U = k)
\]
denote the standardized risk difference.

Define $\alpha_k = P(U = k \mid E = 1) - P(U = k \mid E = 0)$ as the difference in the probability
that the confounder U takes a particular value k comparing exposed and unexposed. We use
$\mathrm{RD}_{EU} = \max_{k \ge 1} |\alpha_k|$, the maximum of these absolute differences, to measure the
exposure-confounder association on the risk difference scale, defined as the maximal risk difference
of the exposure E on the confounder U. Define
$\beta_k^* = P(D = 1 \mid E = 1, U = k) - P(D = 1 \mid E = 1, U = 0)$ and
$\beta_k = P(D = 1 \mid E = 0, U = k) - P(D = 1 \mid E = 0, U = 0)$ as the
difference in the probability of the outcome comparing the categories k and 0 of the confounder U
with and without exposure. Define $\mathrm{RD}_{UD|E=1} = \max_{k \ge 1} |\beta_k^*|$ and $\mathrm{RD}_{UD|E=0} = \max_{k \ge 1} |\beta_k|$
as the maximums of these differences with and without exposure, respectively. We use
$\mathrm{RD}_{UD} = \max(\mathrm{RD}_{UD|E=1}, \mathrm{RD}_{UD|E=0})$ to measure the confounder-outcome association on the
risk difference scale, defined as the maximal risk difference of the confounder U on the
outcome D.

We first consider a binary unmeasured confounder. For binary confounder U, the maximal
risk difference $\mathrm{RD}_{EU}$ becomes the ordinary risk difference of E on U, and the maximal risk
difference $\mathrm{RD}_{UD}$ becomes the maximum of two conditional risk differences,
$\mathrm{RD}_{UD} = \max(\mathrm{RD}_{UD|E=1}, \mathrm{RD}_{UD|E=0})$.
We have that
\[
\mathrm{RD}_{EU} \times \mathrm{RD}_{UD} \ge \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}},
\]
which further leads to the following low and high thresholds:
\[
\min(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}, \qquad
\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}},
\]
which generalize previous results under the null of zero causal effect of E on D.
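The binary-confounder conditions on the risk difference scale are simple enough to check numerically. The sketch below uses hypothetical function names and assumes the observed risk difference is at least as large as the hypothesized true one; the product condition is necessary, not sufficient, for a confounder to account for the gap.

```python
import math


def binary_rd_thresholds(rd_obs, rd_true):
    """Low and high thresholds for (RD_EU, RD_UD) with a binary confounder."""
    gap = rd_obs - rd_true
    if gap < 0:
        raise ValueError("assumes rd_obs >= rd_true")
    return gap, math.sqrt(gap)


def satisfies_product_condition(rd_eu, rd_ud, rd_obs, rd_true):
    """Necessary condition RD_EU * RD_UD >= RD_obs - RD_true."""
    return rd_eu * rd_ud >= rd_obs - rd_true


low, high = binary_rd_thresholds(0.15, 0.05)
print(round(low, 3), round(high, 3))
print(satisfies_product_condition(0.2, 0.3, 0.15, 0.05))   # 0.06 < 0.10
print(satisfies_product_condition(0.4, 0.3, 0.15, 0.05))   # 0.12 >= 0.10
```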

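For categorical confounder U, no simple form of the bounding factor is available, but we
can still show that $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}$ must satisfy the following conditions:
\[
\mathrm{RD}_{EU} \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1),
\]
\[
\mathrm{RD}_{UD} \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2,
\]
\[
\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \max\left\{ \sqrt{(\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1)}, \; (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2 \right\}.
\]

When $K = 3$, such as a three-level genetic confounder, these conditions reduce to
\[
\min(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2, \qquad
\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{(\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2}.
\]

The results above generalize previous results from the null hypothesis of no effect
($\mathrm{RD}_{ED}^{\mathrm{true}} = 0$) to alternative hypotheses ($\mathrm{RD}_{ED}^{\mathrm{true}}$ arbitrary). We show the proofs and
extensions for the above results in the eAppendix.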

We can see from above that the generalized Cornfield conditions for the risk difference
under alternative hypotheses depend on the number of categories of U, and become less
informative as the number of categories increases. Therefore, a binary confounder is not the
most conservative case for sensitivity analysis with parameters expressed on the risk difference
scale. However, the Cornfield conditions for the relative risk do not suffer from this problem.
Therefore, it seems that it is more appropriate to conduct sensitivity analysis with parameters
expressed on the risk ratio scale, and a binary confounder is the most conservative case for
sensitivity analysis with parameters expressed on the risk ratio scale.

Supplementary Materials for “Sensitivity Analysis Without Assumptions”

The eAppendix contains the following nine sections: 

Appendix 1: Three useful lemmas which are used repeatedly in the proofs in later sections; 

Appendix 2: The new bounding factor introduced in the main text and its implied Cornfield 
conditions with proofs; 

Appendix 3: Another bounding factor with the exposure-confounder relationship on the odds 
ratio scale and its implied Cornfield conditions with proofs; 

Appendix 4: Relations between the new bounding factor and some existing results including
Schlesselman’s formula and Flanders and Khoury’s results;

Appendix 5: Results for the risk difference using sensitivity parameters on the relative risk 
scale with proofs; 

Appendix 6: SAS code for the risk difference using sensitivity parameters on the relative risk 
scale; 

Appendix 7: Results for the risk difference using sensitivity parameters on the risk difference 
scale with proofs; 

Appendix 8: A bounding factor for rare time-to-event outcome on the hazard ratio scale and 
its implied Cornfield conditions; 

Appendix 9: A bounding factor for general nonnegative outcomes.

Appendix 1 Useful Lemmas

Lemma A.1. Define $h(x) = (c_1 x + 1)/(c_2 x + 1)$. When $c_1 > c_2$, $h'(x) > 0$ and $h(x)$ is
increasing; when $c_1 \le c_2$, $h'(x) \le 0$ and $h(x)$ is non-increasing.

Proof of Lemma A.1. The first derivative of $h(x)$ is
\[
h'(x) = \frac{c_1 (c_2 x + 1) - (c_1 x + 1) c_2}{(c_2 x + 1)^2} = \frac{c_1 - c_2}{(c_2 x + 1)^2}.
\]
When $c_1 > c_2$, $h'(x) > 0$ and $h(x)$ is increasing in $x$. When $c_1 \le c_2$, we have opposite results. □

Lemma A.2. When $x, y > 1$, $h(x, y) = (xy)/(x + y - 1)$ is increasing in both $x$ and $y$.

Proof of Lemma A.2. The first partial derivative of $h(x, y)$ with respect to $x$ is
\[
\frac{\partial h(x, y)}{\partial x} = \frac{y (x + y - 1) - xy}{(x + y - 1)^2} = \frac{y (y - 1)}{(x + y - 1)^2}.
\]
When $x, y > 1$, $\partial h(x, y)/\partial x > 0$ and $h(x, y)$ is increasing in $x$. By symmetry, the conclusion
holds also for $y$. □

Lemma A.3. When $x, y > 1$, $h(x, y) = (\sqrt{xy} + 1)/(\sqrt{x} + \sqrt{y})$ is increasing in both $x$ and $y$.

Proof of Lemma A.3. The first partial derivative of $h(x, y)$ with respect to $x$ is
\[
\frac{\partial h(x, y)}{\partial x}
= \frac{\frac{1}{2}\sqrt{y/x}\,(\sqrt{x} + \sqrt{y}) - \frac{1}{2}(\sqrt{xy} + 1)/\sqrt{x}}{(\sqrt{x} + \sqrt{y})^2}
= \frac{y - 1}{2 \sqrt{x} (\sqrt{x} + \sqrt{y})^2}.
\]
When $x, y > 1$, $\partial h(x, y)/\partial x > 0$ and $h(x, y)$ is increasing in $x$. By symmetry, the conclusion
holds also for $y$. □

Appendix 2 The New Bounding Factor and Implied Cornfield Conditions
Appendix 2.1 Technical Measure-Theoretical Details

This subsection presents the technical framework for the proofs. A less technical reader
can skip this subsection and move directly to Appendix 2.2 on the new
bounding factor. Throughout the eAppendix, we allow the unmeasured confounder U to
take arbitrary values; that is, U is a measurable mapping from a probability space $(\Omega, \mathcal{F}, P)$ to a
measurable space $(T, \mathcal{T})$. For $V \in \mathcal{T}$, we define $F_1(V) = P(U \in V \mid E = 1)$ as the distribution
of U with exposure, $F_0(V) = P(U \in V \mid E = 0)$ as the distribution of U without exposure, and
$F(V) = P(U \in V)$ as the marginal distribution of U. The distributions $F_1(\cdot)$, $F_0(\cdot)$ and $F(\cdot)$
are measurable mappings from $\mathcal{T}$ to $[0, 1]$, which correspondingly induce three probability
measures on the measurable space $(T, \mathcal{T})$. When the confounder U is a scalar on the real
line, these definitions reduce to $F_1(u) = P(U \le u \mid E = 1)$, the cumulative distribution function
(CDF) of U with exposure, $F_0(u) = P(U \le u \mid E = 0)$, the CDF of U without exposure,
and $F(u) = P(U \le u)$, its marginal CDF. Correspondingly, the CDFs $F_1$, $F_0$, and $F$ also
induce three measures on the real line. In the following, we assume that the measure $F_1$ is
absolutely continuous with respect to the measure $F_0$, with the Radon-Nikodym derivative
defined as $\mathrm{RR}_{EU}(u) = F_1(du)/F_0(du)$, which is the generalized relative risk of E on U at $U = u$.
The absolute continuity assumption about $F_1$ and $F_0$ holds automatically for categorical
and absolutely continuous unmeasured confounder U. For general confounder U, this is only
a mild regularity condition.

Appendix 2.2 The New Bounding Factor

We assume for the next several sections that analysis is done conditional on, or within strata
of, the measured confounders C. We define the maximal relative risk of E on U as
$\mathrm{RR}_{EU} = \max_u \mathrm{RR}_{EU}(u)$. Define $r(u) = P(D = 1 \mid E = 0, U = u)$ and $r^*(u) = P(D = 1 \mid E = 1, U = u)$
as the probabilities of the outcome within stratum $U = u$ without and with exposure. Define
the maximal relative risk of U on D as $\mathrm{RR}_{UD|E=0} = \max_u r(u)/\min_u r(u)$ and
$\mathrm{RR}_{UD|E=1} = \max_u r^*(u)/\min_u r^*(u)$ without and with exposure, and
$\mathrm{RR}_{UD} = \max(\mathrm{RR}_{UD|E=0}, \mathrm{RR}_{UD|E=1})$
as the maximum of these two relative risks. Here and hereinafter, the maxima and minima are
taken over the space $T$. When U is a categorical confounder with levels $0, 1, \ldots, K - 1$, the
definitions above reduce to the definitions in the main text. To allow for causal interpretations, we
invoke the counterfactual or potential outcomes framework, with $D_i(1)$ and $D_i(0)$ being the
potential outcomes for individual $i$ with and without the exposure, respectively; we also need
to make the ignorability assumption $E \perp\!\!\!\perp \{D(1), D(0)\} \mid U$.

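The observed relative risk of the exposure E on the outcome D is
\[
\mathrm{RR}_{ED}^{\mathrm{obs}}
= \frac{\int P(D = 1 \mid E = 1, U = u) F_1(du)}{\int P(D = 1 \mid E = 0, U = u) F_0(du)}
= \frac{\int r^*(u) F_1(du)}{\int r(u) F_0(du)},
\]
where, here and hereinafter, the integrals are over $T$. The relative risks standardized by the exposed,
the unexposed, and the whole population are as follows:
\[
\mathrm{RR}_{ED+}^{\mathrm{true}} = \frac{\int r^*(u) F_1(du)}{\int r(u) F_1(du)}, \qquad
\mathrm{RR}_{ED-}^{\mathrm{true}} = \frac{\int r^*(u) F_0(du)}{\int r(u) F_0(du)}, \qquad
\mathrm{RR}_{ED}^{\mathrm{true}} = \frac{\int r^*(u) F(du)}{\int r(u) F(du)}.
\]
When the unmeasured confounder U is categorical, $\mathrm{RR}_{ED}^{\mathrm{true}}$ reduces to the form in the main text,
and all other relative risk measures can be simplified by replacing integrations by summations.
The corresponding confounding relative risks standardized by the exposed, the unexposed,
and the whole population are
\[
\mathrm{CRR}_{ED+} = \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED+}^{\mathrm{true}}} = \frac{\int r(u) F_1(du)}{\int r(u) F_0(du)}, \qquad
\mathrm{CRR}_{ED-} = \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED-}^{\mathrm{true}}} = \frac{\int r^*(u) F_1(du)}{\int r^*(u) F_0(du)},
\]
and $\mathrm{CRR}_{ED} = \mathrm{RR}_{ED}^{\mathrm{obs}}/\mathrm{RR}_{ED}^{\mathrm{true}}$. Similar to Lee, we have that $\mathrm{RR}_{ED}^{\mathrm{true}}$ is a weighted average
of $\mathrm{RR}_{ED+}^{\mathrm{true}}$ and $\mathrm{RR}_{ED-}^{\mathrm{true}}$, and $\mathrm{CRR}_{ED}$ is a harmonic average of $\mathrm{CRR}_{ED+}$ and $\mathrm{CRR}_{ED-}$.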

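Proposition A.1. We have
\[
\mathrm{RR}_{ED}^{\mathrm{true}} = w \, \mathrm{RR}_{ED+}^{\mathrm{true}} + (1 - w) \, \mathrm{RR}_{ED-}^{\mathrm{true}}, \qquad
1/\mathrm{CRR}_{ED} = w/\mathrm{CRR}_{ED+} + (1 - w)/\mathrm{CRR}_{ED-},
\]
where $f = P(E = 1)$ and $w$ is a weight between zero and one:
\[
w = \frac{f \int r(u) F_1(du)}{f \int r(u) F_1(du) + (1 - f) \int r(u) F_0(du)} \in [0, 1].
\]

Proof of Proposition A.1. The conclusions follow from the following decomposition:
\[
\mathrm{RR}_{ED}^{\mathrm{true}}
= \frac{\int r^*(u) F(du)}{\int r(u) F(du)}
= \frac{f \int r^*(u) F_1(du) + (1 - f) \int r^*(u) F_0(du)}{f \int r(u) F_1(du) + (1 - f) \int r(u) F_0(du)}
\]
\[
= \frac{f \int r(u) F_1(du)}{f \int r(u) F_1(du) + (1 - f) \int r(u) F_0(du)} \times \frac{\int r^*(u) F_1(du)}{\int r(u) F_1(du)}
+ \frac{(1 - f) \int r(u) F_0(du)}{f \int r(u) F_1(du) + (1 - f) \int r(u) F_0(du)} \times \frac{\int r^*(u) F_0(du)}{\int r(u) F_0(du)}.
\]
□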


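The confounding relative risks can be bounded from above by the bounding factor
\[
\mathrm{BF}_U = \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1},
\]
as shown in the following proposition.

Proposition A.2. The confounding relative risks can be bounded from above by
\[
\mathrm{CRR}_{ED+} = \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED+}^{\mathrm{true}}} \le \mathrm{BF}_U, \qquad
\mathrm{CRR}_{ED-} = \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED-}^{\mathrm{true}}} \le \mathrm{BF}_U, \qquad
\mathrm{CRR}_{ED} = \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED}^{\mathrm{true}}} \le \mathrm{BF}_U.
\]

Proof of Proposition A.2. In the following proof, we first discuss $\mathrm{CRR}_{ED+}$. The key
observation is to write $\mathrm{CRR}_{ED+}$ in terms of a binary confounder with two levels corresponding to
$\max_u r(u)$ and $\min_u r(u)$. To be more specific, we have that
\[
\mathrm{CRR}_{ED+} = \frac{w_1 \max_u r(u) + (1 - w_1) \min_u r(u)}{w_0 \max_u r(u) + (1 - w_0) \min_u r(u)},
\]
where
\[
w_1 = \frac{\int \{r(u) - \min_u r(u)\} F_1(du)}{\max_u r(u) - \min_u r(u)}, \qquad
1 - w_1 = \frac{\int \{\max_u r(u) - r(u)\} F_1(du)}{\max_u r(u) - \min_u r(u)},
\]
\[
w_0 = \frac{\int \{r(u) - \min_u r(u)\} F_0(du)}{\max_u r(u) - \min_u r(u)}, \qquad
1 - w_0 = \frac{\int \{\max_u r(u) - r(u)\} F_0(du)}{\max_u r(u) - \min_u r(u)}.
\]
Define $\Gamma = w_1/w_0$, and we have
\[
\Gamma = \frac{w_1}{w_0}
= \frac{\int \{r(u) - \min_u r(u)\} F_1(du)}{\int \{r(u) - \min_u r(u)\} F_0(du)}
= \frac{\int \{r(u) - \min_u r(u)\} \mathrm{RR}_{EU}(u) F_0(du)}{\int \{r(u) - \min_u r(u)\} F_0(du)}
\]
\[
\le \frac{\max_u \mathrm{RR}_{EU}(u) \times \int \{r(u) - \min_u r(u)\} F_0(du)}{\int \{r(u) - \min_u r(u)\} F_0(du)}
= \mathrm{RR}_{EU}.
\]
We can write $w_0 = w_1/\Gamma$, and therefore we have
\[
\mathrm{CRR}_{ED+} = \frac{\{\max_u r(u) - \min_u r(u)\} \times w_1 + \min_u r(u)}{\{\max_u r(u) - \min_u r(u)\}/\Gamma \times w_1 + \min_u r(u)}.
\]
In the following, we divide our discussion into two cases. If $\Gamma > 1$, then $\mathrm{CRR}_{ED+}$ is increasing
in $w_1$ according to Lemma A.1, and it attains the maximum at $w_1 = 1$. Thus we have
\[
\mathrm{CRR}_{ED+} \le \frac{\Gamma \times \mathrm{RR}_{UD|E=0}}{\Gamma + \mathrm{RR}_{UD|E=0} - 1}
\le \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD|E=0}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD|E=0} - 1},
\]
where the second inequality follows from Lemma A.2. If $\Gamma \le 1$, then $\mathrm{CRR}_{ED+}$ is
non-increasing in $w_1$, and it attains the maximum at $w_1 = 0$. Thus we have
\[
\mathrm{CRR}_{ED+} \le 1 \le \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD|E=0}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD|E=0} - 1},
\]
where the second inequality again follows from Lemma A.2.

The same discussion applies to $\mathrm{CRR}_{ED-}$, and we can obtain that
\[
\mathrm{CRR}_{ED-} \le \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD|E=1}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD|E=1} - 1}.
\]
Since $\mathrm{RR}_{UD} = \max(\mathrm{RR}_{UD|E=0}, \mathrm{RR}_{UD|E=1})$, Lemma A.2 shows that both $\mathrm{CRR}_{ED+}$ and
$\mathrm{CRR}_{ED-}$ are bounded by $\mathrm{BF}_U$. Using the fact
$1/\mathrm{CRR}_{ED} = w/\mathrm{CRR}_{ED+} + (1 - w)/\mathrm{CRR}_{ED-}$, we know that
\[
\frac{1}{\mathrm{CRR}_{ED}} \ge \left( \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1} \right)^{-1},
\]
and the conclusion follows. □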
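The bound on $\mathrm{CRR}_{ED+}$ can be checked numerically for categorical confounders. The following simulation is only an illustration under assumed random inputs (the helper names and the sampling ranges are arbitrary choices, not from the paper): it draws random distributions of U with and without exposure and random outcome risks $r(u)$, and confirms that the confounding relative risk never exceeds the bounding factor.

```python
import random


def crr_exposed_and_bound(f1, f0, r):
    """Return (CRR_ED+, bounding factor) for a categorical confounder.

    f1, f0: distributions of U with and without exposure; r: P(D=1 | E=0, U=k).
    """
    rr_eu = max(a / b for a, b in zip(f1, f0))   # maximal relative risk of E on U
    rr_ud = max(r) / min(r)                      # maximal relative risk of U on D (E = 0)
    crr = sum(x * a for x, a in zip(r, f1)) / sum(x * b for x, b in zip(r, f0))
    return crr, rr_eu * rr_ud / (rr_eu + rr_ud - 1)


def random_distribution(k, rng):
    weights = [rng.uniform(0.05, 1.0) for _ in range(k)]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
max_ratio = 0.0
for _ in range(10000):
    k = rng.randint(2, 6)
    f1, f0 = random_distribution(k, rng), random_distribution(k, rng)
    r = [rng.uniform(0.01, 0.9) for _ in range(k)]
    crr, bf = crr_exposed_and_bound(f1, f0, r)
    max_ratio = max(max_ratio, crr / bf)
print(max_ratio <= 1 + 1e-12)
```

The bound is attained by a binary confounder that is always present among the exposed, for example `f1 = [0.0, 1.0]`, `f0 = [0.75, 0.25]` and `r = [0.1, 0.3]`, where both quantities equal 2.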

Appendix 2.3 The Implied Cornfield Conditions 


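Proposition A.2 says that the bounding factor is larger than or equal to all the confounding
relative risks. It can be viewed as the Cornfield condition for the joint value of $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$
in order to reduce the observed relative risk of $\mathrm{RR}_{ED}^{\mathrm{obs}}$ to the causal relative risk of $\mathrm{RR}_{ED}^{\mathrm{true}}$. If
we specify one of the unmeasured confounding measures, for example $\mathrm{RR}_{EU}$, then we can
solve the inequality in Proposition A.2 and obtain the lower bound of the other confounding measure:
\[
\mathrm{RR}_{UD} \ge \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{ED}^{\mathrm{obs}} - \mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{EU} \times \mathrm{RR}_{ED}^{\mathrm{true}} - \mathrm{RR}_{ED}^{\mathrm{obs}}}.
\]
When $\mathrm{RR}_{ED}^{\mathrm{true}} = 1$, the above lower bound reduces to
\[
\mathrm{RR}_{UD} \ge \frac{\mathrm{RR}_{EU} \times \mathrm{RR}_{ED}^{\mathrm{obs}} - \mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{EU} - \mathrm{RR}_{ED}^{\mathrm{obs}}}.
\]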
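A small helper makes the lower bound on $\mathrm{RR}_{UD}$ concrete. This is a sketch with a hypothetical function name; it assumes an apparently causative exposure and returns infinity when $\mathrm{RR}_{EU}$ does not exceed $\mathrm{RR}_{ED}^{\mathrm{obs}}/\mathrm{RR}_{ED}^{\mathrm{true}}$, since in that case the bounding factor stays below the required value for every finite $\mathrm{RR}_{UD}$.

```python
import math


def min_rr_ud(rr_obs: float, rr_true: float, rr_eu: float) -> float:
    """Smallest RR_UD for which the bounding factor reaches rr_obs / rr_true, given RR_EU."""
    denominator = rr_eu * rr_true - rr_obs
    if denominator <= 0:
        # RR_EU <= CRR: the low-threshold Cornfield condition already fails.
        return math.inf
    return (rr_eu * rr_obs - rr_obs) / denominator


# Explaining away RR_obs = 10.73 (rr_true = 1) when RR_EU is taken to be 20.
v = min_rr_ud(10.73, 1.0, 20.0)
print(round(v, 2))
print(round(20.0 * v / (20.0 + v - 1), 2))  # bounding factor at the boundary
```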


Further, Proposition A.2 implies the following Cornfield-type conditions for $\mathrm{RR}_{EU}$ and $\mathrm{RR}_{UD}$.

Proposition A.3. We have the following Cornfield conditions:
\[
\min(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge \mathrm{CRR}_{ED}, \qquad
\max(\mathrm{RR}_{EU}, \mathrm{RR}_{UD}) \ge \mathrm{CRR}_{ED} + \sqrt{\mathrm{CRR}_{ED} (\mathrm{CRR}_{ED} - 1)}.
\]

Proof of Proposition A.3. According to Lemma A.2, the right-hand side of the last inequality
in Proposition A.2 is increasing in both $\mathrm{RR}_{UD}$ and $\mathrm{RR}_{EU}$. Therefore, the right-hand side of
that inequality will increase if we let $\mathrm{RR}_{UD}$ or $\mathrm{RR}_{EU}$ go to large
extremes. Let $\mathrm{RR}_{UD} \to \infty$, and we have $\mathrm{CRR}_{ED} \le \mathrm{RR}_{EU}$. Let $\mathrm{RR}_{EU} \to \infty$, and we have
$\mathrm{CRR}_{ED} \le \mathrm{RR}_{UD}$. Therefore, we have the following low threshold:
$\min(\mathrm{RR}_{UD}, \mathrm{RR}_{EU}) \ge \mathrm{CRR}_{ED}$. By Lemma A.2, we can obtain the following inequality by replacing
$\mathrm{RR}_{UD}$ and $\mathrm{RR}_{EU}$ in the bounding factor by their maximum value:
\[
\mathrm{CRR}_{ED} \le \frac{\max^2(\mathrm{RR}_{UD}, \mathrm{RR}_{EU})}{2 \max(\mathrm{RR}_{UD}, \mathrm{RR}_{EU}) - 1}.
\]
Solving this inequality for $\max(\mathrm{RR}_{UD}, \mathrm{RR}_{EU})$, we obtain the high threshold. □

Appendix 2.4 Preventive Exposures 

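The bounding factor in Proposition A.2 is particularly useful for an apparently causative
exposure with $\mathrm{RR}_{ED}^{\mathrm{obs}} > 1$: the true causal relative risk is an attenuation of $\mathrm{RR}_{ED}^{\mathrm{obs}}$ by the
bounding factor. However, for apparently preventive exposure with $\mathrm{RR}_{ED}^{\mathrm{obs}} < 1$, we can derive
an equally useful bias formula. For apparently preventive exposure, we modify the definition
of the relative risk between E and U as $\mathrm{RR}_{EU} = \max_u \mathrm{RR}_{EU}^{-1}(u) = 1/\min_u \mathrm{RR}_{EU}(u)$, and
obtain the following analogous result.

Proposition A.4. For apparently preventive exposure, we have $\mathrm{RR}_{ED}^{\mathrm{true}}/\mathrm{RR}_{ED}^{\mathrm{obs}} \le \mathrm{BF}_U$. Or,
equivalently, the true causal relative risk is an inflation of $\mathrm{RR}_{ED}^{\mathrm{obs}}$ by at most the bounding factor.

Proof of Proposition A.4. Define $\bar{E} = 1 - E$. Since the exposure $E$ is apparently preventive for
the outcome, $\bar{E}$ is apparently causative. Therefore, Proposition A.2 implies that
\[
\frac{\mathrm{RR}_{\bar{E}D}^{\mathrm{obs}}}{\mathrm{RR}_{\bar{E}D}^{\mathrm{true}}} \le \frac{\mathrm{RR}_{\bar{E}U} \times \mathrm{RR}_{UD}}{\mathrm{RR}_{\bar{E}U} + \mathrm{RR}_{UD} - 1}.
\]
Since $\mathrm{RR}_{\bar{E}D}^{\mathrm{obs}} = 1/\mathrm{RR}_{ED}^{\mathrm{obs}}$, $\mathrm{RR}_{\bar{E}D}^{\mathrm{true}} = 1/\mathrm{RR}_{ED}^{\mathrm{true}}$, and
$\mathrm{RR}_{\bar{E}U} = \max_u \mathrm{RR}_{EU}^{-1}(u) = \mathrm{RR}_{EU}$, the
conclusion follows. □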
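For a preventive exposure the correction therefore multiplies rather than divides. A minimal sketch, with a hypothetical function name and with `rr_eu` understood in the modified sense $1/\min_u \mathrm{RR}_{EU}(u)$:

```python
def preventive_upper_bound(rr_obs: float, rr_eu: float, rr_ud: float) -> float:
    """Upper bound on the true relative risk for an apparently preventive exposure (rr_obs < 1)."""
    # Assumption: rr_eu is the modified measure 1 / min_u RR_EU(u), so rr_eu >= 1.
    bf = rr_eu * rr_ud / (rr_eu + rr_ud - 1)
    return rr_obs * bf


# An observed RR of 0.5 with (RR_EU, RR_UD) = (2, 2) could be as weak as about 0.67.
print(round(preventive_upper_bound(0.5, 2, 2), 2))
```

To move such an estimate all the way to the null, the bounding factor would have to reach $1/\mathrm{RR}_{ED}^{\mathrm{obs}}$, which is 2 in this example.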

Appendix 2.5 Averaged Over Observed Covariates 

All the results above are within strata of observed covariates C. The probabilities are
conditional probabilities (e.g., $P(D = 1 \mid E = 1, C = c)$, $P\{D(1) = 1 \mid E = 0, C = c\}$, etc.), the
causal relative risks are conditional causal measures (e.g.,
$\mathrm{RR}_{ED+|c}^{\mathrm{true}} = P\{D(1) = 1 \mid E = 1, C = c\}/P\{D(0) = 1 \mid E = 1, C = c\}$, etc.), and the bounding factor is also conditional, denoted as
$\mathrm{BF}_{U|c} = \mathrm{RR}_{EU|c} \times \mathrm{RR}_{UD|c}/(\mathrm{RR}_{EU|c} + \mathrm{RR}_{UD|c} - 1)$.

We have the following decomposition:
\[
\mathrm{RR}_{ED}^{\mathrm{true}}
= \frac{\int P(D = 1 \mid E = 1, C = c, U = u) F_{CU}(dc \, du)}{\int P(D = 1 \mid E = 0, C = c, U = u) F_{CU}(dc \, du)}
= \frac{\int\!\!\int P(D = 1 \mid E = 1, C = c, U = u) F_{U|C}(du) F_C(dc)}{\int\!\!\int P(D = 1 \mid E = 0, C = c, U = u) F_{U|C}(du) F_C(dc)}
\]
\[
= \frac{\int P\{D(1) = 1 \mid C = c\} F_C(dc)}{\int P\{D(0) = 1 \mid C = c\} F_C(dc)}
= \frac{\int \mathrm{RR}_{ED|c}^{\mathrm{true}} P\{D(0) = 1 \mid C = c\} F_C(dc)}{\int P\{D(0) = 1 \mid C = c\} F_C(dc)}.
\]

Applying the result about the conditional causal relative risk, we have
\[
\mathrm{RR}_{ED}^{\mathrm{true}}
\ge \frac{\int \frac{\mathrm{RR}_{ED|c}^{\mathrm{obs}}}{\mathrm{BF}_{U|c}} P\{D(0) = 1 \mid C = c\} F_C(dc)}{\int P\{D(0) = 1 \mid C = c\} F_C(dc)}
\ge \min_c \frac{\mathrm{RR}_{ED|c}^{\mathrm{obs}}}{\mathrm{BF}_{U|c}}.
\]

If we assume a common causal relative risk $\mathrm{RR}_{ED|c}^{\mathrm{true}} = \mathrm{RR}_{ED}^{\mathrm{true}}$, then we can sharpen the result
as:
\[
\mathrm{RR}_{ED}^{\mathrm{true}} \ge \max_c \frac{\mathrm{RR}_{ED|c}^{\mathrm{obs}}}{\mathrm{BF}_{U|c}}.
\]

Appendix 3 Another Bounding Factor and Implied Cornfield Conditions Using the Odds Ratio

Appendix 3.1 Another Bounding Factor Using the Odds Ratio

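Define $p(u) = P(E = 1 \mid U = u)$ as the probability of the exposure, and $q(u) = p(u)/\{1 - p(u)\}$
as the odds of the exposure within level $u$ of the confounder U. Let $\mathrm{OR}_{EU} = \max_u q(u)/\min_u q(u)$
be the ratio of the maximum and minimum of these odds. We use $\mathrm{OR}_{EU}$ to measure the
association between the confounder U and the exposure E, which is defined as the maximal
odds ratio between the exposure E and the confounder U. When the confounder U is binary,
it reduces to the ordinary odds ratio. Using the odds ratio between the exposure E and U
and the relative risk of the confounder U on the outcome D as the association measures, as in
Bross and Lee, we have the following bounding factor that ties $\mathrm{CRR}_{ED}$ with $\mathrm{OR}_{EU}$ and
$\mathrm{RR}_{UD}$.

Proposition A.5. We have
\[
\left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD}}} \right)^2
\ge \frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED}^{\mathrm{true}}} = \mathrm{CRR}_{ED}. \tag{A.1}
\]

Proof of Proposition A.5. Lee obtained the following results:
\[
\mathrm{CRR}_{ED+} \le \left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD|E=0}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD|E=0}}} \right)^2, \qquad
\mathrm{CRR}_{ED-} \le \left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD|E=1}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD|E=1}}} \right)^2. \tag{A.2}
\]
Since $\mathrm{RR}_{UD} = \max(\mathrm{RR}_{UD|E=0}, \mathrm{RR}_{UD|E=1})$, Lemma A.3 implies that
\[
\mathrm{CRR}_{ED+} \le \left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD}}} \right)^2, \qquad
\mathrm{CRR}_{ED-} \le \left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD}}} \right)^2,
\]
which leads to
\[
\frac{1}{\mathrm{CRR}_{ED}} = \frac{w}{\mathrm{CRR}_{ED+}} + \frac{1 - w}{\mathrm{CRR}_{ED-}}
\ge \left( \frac{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD}}}{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD}} + 1} \right)^2,
\]
and the conclusion follows. □

Appendix 3.2 Implied Cornfield Conditions

The bounding factor in the last subsection implies the following Cornfield conditions:

Proposition A.6. We have
\[
\min(\mathrm{OR}_{EU}, \mathrm{RR}_{UD}) \ge \mathrm{CRR}_{ED}, \qquad
\max(\mathrm{OR}_{EU}, \mathrm{RR}_{UD}) \ge \left( \sqrt{\mathrm{CRR}_{ED}} + \sqrt{\mathrm{CRR}_{ED} - 1} \right)^2.
\]

Proof of Proposition A.6. According to Lemma A.3, we can let $\mathrm{RR}_{UD}$ go to infinity, and
obtain $\mathrm{OR}_{EU} \ge \mathrm{CRR}_{ED}$. Similarly, we can let $\mathrm{OR}_{EU}$ go to infinity, and obtain
$\mathrm{RR}_{UD} \ge \mathrm{CRR}_{ED}$. Combining them together, we have the following low threshold:
$\min(\mathrm{OR}_{EU}, \mathrm{RR}_{UD}) \ge \mathrm{CRR}_{ED}$. According to Lemma A.3 again, we can replace $\mathrm{OR}_{EU}$ and $\mathrm{RR}_{UD}$ by
$\max(\mathrm{OR}_{EU}, \mathrm{RR}_{UD})$ in the bounding factor in Appendix 3.1, and preserve the inequality as follows:
\[
\mathrm{CRR}_{ED} \le \left( \frac{\max(\mathrm{OR}_{EU}, \mathrm{RR}_{UD}) + 1}{2 \sqrt{\max(\mathrm{OR}_{EU}, \mathrm{RR}_{UD})}} \right)^2.
\]
Solving the above inequality, we obtain
$\sqrt{\max(\mathrm{OR}_{EU}, \mathrm{RR}_{UD})} \ge \sqrt{\mathrm{CRR}_{ED}} + \sqrt{\mathrm{CRR}_{ED} - 1}$,
and the high threshold follows. □

Propositions A.5 and A.6 generalize the results of Bross and Lee from only being
applicable under the null hypothesis of no effect (i.e., only being useful for assessing how
much unmeasured confounding would suffice to completely explain away an effect estimate)
to alternative hypotheses and sensitivity analysis.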
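The odds-ratio version of the bounding factor and its high threshold can be sketched in the same way. The function names below are hypothetical; the second function returns the common value of $\mathrm{OR}_{EU}$ and $\mathrm{RR}_{UD}$ at which the bounding factor of Proposition A.5 equals a given confounding relative risk.

```python
import math


def or_bounding_factor(or_eu: float, rr_ud: float) -> float:
    """Bounding factor of Proposition A.5, using the odds ratio between E and U."""
    return ((math.sqrt(or_eu * rr_ud) + 1) / (math.sqrt(or_eu) + math.sqrt(rr_ud))) ** 2


def or_high_threshold(crr: float) -> float:
    """High threshold (sqrt(CRR) + sqrt(CRR - 1))^2 of Proposition A.6, assuming CRR >= 1."""
    return (math.sqrt(crr) + math.sqrt(crr - 1)) ** 2


m = or_high_threshold(2.0)
print(round(m, 3))                          # common strength needed for CRR = 2
print(round(or_bounding_factor(m, m), 3))   # bounding factor at that strength
```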

Appendix 3.3 Preventive Exposure 

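For apparently preventive exposure with $\mathrm{RR}_{ED}^{\mathrm{obs}} < 1$, we can derive a bias formula similar to
Proposition A.5, and we do not even need to modify the definition of $\mathrm{OR}_{EU}$.

Proposition A.7. For apparently preventive exposure, we have
\[
\frac{\mathrm{RR}_{ED}^{\mathrm{true}}}{\mathrm{RR}_{ED}^{\mathrm{obs}}} \le \left( \frac{\sqrt{\mathrm{OR}_{EU} \mathrm{RR}_{UD}} + 1}{\sqrt{\mathrm{OR}_{EU}} + \sqrt{\mathrm{RR}_{UD}}} \right)^2.
\]

Proof of Proposition A.7. Define $\bar{E} = 1 - E$. Applying Proposition A.5, we have
\[
\frac{\mathrm{RR}_{\bar{E}D}^{\mathrm{obs}}}{\mathrm{RR}_{\bar{E}D}^{\mathrm{true}}} \le \left( \frac{\sqrt{\mathrm{OR}_{\bar{E}U} \mathrm{RR}_{UD}} + 1}{\sqrt{\mathrm{OR}_{\bar{E}U}} + \sqrt{\mathrm{RR}_{UD}}} \right)^2.
\]
Since $\mathrm{RR}_{\bar{E}D}^{\mathrm{obs}} = 1/\mathrm{RR}_{ED}^{\mathrm{obs}}$, $\mathrm{RR}_{\bar{E}D}^{\mathrm{true}} = 1/\mathrm{RR}_{ED}^{\mathrm{true}}$, and
\[
\mathrm{OR}_{\bar{E}U} = \frac{\max_u 1/q(u)}{\min_u 1/q(u)} = \frac{1/\min_u q(u)}{1/\max_u q(u)} = \frac{\max_u q(u)}{\min_u q(u)} = \mathrm{OR}_{EU},
\]
the conclusion follows. □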


Appendix 4 Relations with Existing Results 

Appendix 4.1 Schlesselman’s Formula 

For a binary confounder $U$, Schlesselman first obtained that

$$\frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED}^{\mathrm{true}}} = \frac{1 + (\mathrm{RR}_{UD|E=1} - 1)P(U = 1 \mid E = 1)}{1 + (\mathrm{RR}_{UD|E=0} - 1)P(U = 1 \mid E = 0)}.$$

He further assumed a common relative risk of the exposure $E$ on the outcome $D$ within both
$U = 0$ and $U = 1$, and also a common relative risk of the confounder $U$ on the outcome $D$
within both $E = 0$ and $E = 1$, denoted by $\gamma$. Under this no-interaction assumption,
Schlesselman simplified the above identity to the following formula:

$$\frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED}^{\mathrm{true}}} = \frac{1 + (\gamma - 1)P(U = 1 \mid E = 1)}{1 + (\gamma - 1)P(U = 1 \mid E = 0)}.$$

We can write $P(U = 1 \mid E = 0) = P(U = 1 \mid E = 1)/\mathrm{RR}_{EU}$ and then maximize the right-hand
side of the above formula over $P(U = 1 \mid E = 1)$, which gives us the following inequality:

$$\frac{\mathrm{RR}_{ED}^{\mathrm{obs}}}{\mathrm{RR}_{ED}^{\mathrm{true}}} \le \frac{\mathrm{RR}_{EU} \times \gamma}{\mathrm{RR}_{EU} + \gamma - 1}.$$





The inequality above is the same as our main result in the main text, but it is derived under
unnecessary assumptions. Our result is much more general than the previous result obtained
by Schlesselman, and his assumptions are not necessary for deriving our new bounding
factor.
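The maximization step can be illustrated numerically. The sketch below is an illustration only, with hypothetical function names and illustrative parameter values; it evaluates Schlesselman's no-interaction ratio over a grid of values of $P(U = 1 \mid E = 1)$ and compares the largest value with $\mathrm{RR}_{EU} \times \gamma/(\mathrm{RR}_{EU} + \gamma - 1)$.

```python
def schlesselman_ratio(p1, rr_eu, gamma):
    """RR_obs / RR_true under no interaction, with P(U=1|E=0) = p1 / rr_eu."""
    p0 = p1 / rr_eu
    return (1.0 + (gamma - 1.0) * p1) / (1.0 + (gamma - 1.0) * p0)


def bounding_factor(rr_eu, gamma):
    """RR_EU * gamma / (RR_EU + gamma - 1)."""
    return rr_eu * gamma / (rr_eu + gamma - 1.0)


rr_eu, gamma = 2.0, 3.0                      # illustrative values only
grid = [i / 1000 for i in range(1001)]       # candidate values of P(U=1|E=1)
largest = max(schlesselman_ratio(p, rr_eu, gamma) for p in grid)
# The ratio increases in P(U=1|E=1), so the maximum is at P(U=1|E=1) = 1.
print(largest, bounding_factor(rr_eu, gamma))
```

For these values both printed numbers equal 1.5, which is the bound attained at $P(U = 1 \mid E = 1) = 1$.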

Appendix 4.2 Flanders and Khoury’s results 

Flanders and Khoury used slightly different notation for a categorical confounder $U$:

$$p_k = P(U = k \mid E = 0), \qquad \mathrm{OR}_k = \frac{P(U = k \mid E = 1)/P(U = 0 \mid E = 1)}{P(U = k \mid E = 0)/P(U = 0 \mid E = 0)}, \qquad \mathrm{RR}_k = \frac{P(D = 1 \mid U = k, E = 0)}{P(D = 1 \mid U = 0, E = 0)}.$$

They expressed the confounding relative risk for the exposed population as

$$\mathrm{CRR}_{ED+} = \frac{\sum_k \mathrm{RR}_k \mathrm{OR}_k p_k}{\left(\sum_k \mathrm{OR}_k p_k\right)\left(\sum_k \mathrm{RR}_k p_k\right)}.$$

The above sensitivity analysis formula depends on a large number of sensitivity parameters,
and requires specifying the prevalence of the unmeasured confounder among the unexposed
population. Flanders and Khoury simplified it for a binary confounder. However, for a general
categorical confounder, they derived the following bounds on the confounding relative risk:

$$\mathrm{CRR}_{ED+} \le \min\left\{\frac{\max_k \mathrm{OR}_k}{\sum_k \mathrm{OR}_k p_k}, \frac{\max_k \mathrm{RR}_k}{\sum_k \mathrm{RR}_k p_k}, \max_k \mathrm{OR}_k, \max_k \mathrm{RR}_k, \frac{1}{p_{k^*}}, \frac{1}{p_{k^{**}}}\right\},$$

where $k^*$ and $k^{**}$ are the strata corresponding to the largest $\mathrm{OR}_k$ and $\mathrm{RR}_k$, respectively. The
upper bound depends on the prevalence of $U$. If we do not have any knowledge about the
number of categories or the prevalence of $U$, the above bound reduces to

$$\mathrm{CRR}_{ED+} \le \min\left\{\max_k \mathrm{OR}_k, \max_k \mathrm{RR}_k\right\},$$

which is essentially the low threshold Cornfield condition.
Appendix 5 Results for the Risk Difference Using Sensitivity
Parameters on the Relative Risk Scale

Appendix 5.1 Lower Bounds for the Causal Risk Differences 

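Define $p_1 = P(D = 1 \mid E = 1)$ and $p_0 = P(D = 1 \mid E = 0)$ as the probabilities of the outcome
with and without exposure, and $f = P(E = 1)$ as the prevalence of the exposure. The causal
risk differences for the exposed, unexposed and the whole population are defined as

$$\mathrm{RD}_{ED+}^{\mathrm{true}} = P\{D(1) = 1 \mid E = 1\} - P\{D(0) = 1 \mid E = 1\} = p_1 - P\{D(0) = 1 \mid E = 1\},$$

$$\mathrm{RD}_{ED-}^{\mathrm{true}} = P\{D(1) = 1 \mid E = 0\} - P\{D(0) = 1 \mid E = 0\} = P\{D(1) = 1 \mid E = 0\} - p_0,$$

$$\mathrm{RD}_{ED}^{\mathrm{true}} = P\{D(1) = 1\} - P\{D(0) = 1\}.$$

If $U$ suffices to control the confounding between the exposure and the outcome, then the
following standardized risk differences are the causal risk differences for the exposed,
unexposed and the whole population:

$$\mathrm{RD}_{ED+}^{\mathrm{true}} = p_1 - \int r(u)F_1(du), \qquad \mathrm{RD}_{ED-}^{\mathrm{true}} = \int r^*(u)F_0(du) - p_0, \qquad \mathrm{RD}_{ED}^{\mathrm{true}} = \int \{r^*(u) - r(u)\}F(du).$$

Proposition A.8. The lower bounds for the causal risk differences are

$$\mathrm{RD}_{ED+}^{\mathrm{true}} \ge p_1 - p_0 \times \mathrm{BF}_U,$$

$$\mathrm{RD}_{ED-}^{\mathrm{true}} \ge p_1/\mathrm{BF}_U - p_0,$$

$$\mathrm{RD}_{ED}^{\mathrm{true}} \ge (p_1 - p_0 \times \mathrm{BF}_U) \times \{f + (1 - f)/\mathrm{BF}_U\} = (p_1/\mathrm{BF}_U - p_0) \times \{f \times \mathrm{BF}_U + (1 - f)\}.$$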

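Proof of Proposition A.8. From the data, we can identify:

$$p_1 = \int P(D = 1 \mid E = 1, U = u)F_1(du) = \int r^*(u)F_1(du),$$

$$p_0 = \int P(D = 1 \mid E = 0, U = u)F_0(du) = \int r(u)F_0(du).$$

However, the following two counterfactual probabilities are not identifiable:

$$P\{D(1) = 1 \mid E = 0\} = \int P(D = 1 \mid E = 1, U = u)F_0(du) = \int r^*(u)F_0(du),$$

$$P\{D(0) = 1 \mid E = 1\} = \int P(D = 1 \mid E = 0, U = u)F_1(du) = \int r(u)F_1(du).$$

First, we have

$$\frac{p_1}{P\{D(1) = 1 \mid E = 0\}} = \frac{\int r^*(u)F_1(du)}{\int r^*(u)F_0(du)} = \mathrm{CRR}_{ED-} \le \mathrm{BF}_U$$

according to Proposition A.2, and thus $P\{D(1) = 1 \mid E = 0\} \ge p_1/\mathrm{BF}_U$. Second, we have

$$\frac{P\{D(0) = 1 \mid E = 1\}}{p_0} = \frac{\int r(u)F_1(du)}{\int r(u)F_0(du)} = \mathrm{CRR}_{ED+} \le \mathrm{BF}_U$$

according to Proposition A.2, and thus $P\{D(0) = 1 \mid E = 1\} \le p_0 \times \mathrm{BF}_U$. Therefore, the lower
bound for $\mathrm{RD}_{ED+}^{\mathrm{true}}$ is $\mathrm{RD}_{ED+}^{\mathrm{true}} \ge p_1 - p_0 \times \mathrm{BF}_U$, and for $\mathrm{RD}_{ED-}^{\mathrm{true}}$ it is $\mathrm{RD}_{ED-}^{\mathrm{true}} \ge p_1/\mathrm{BF}_U - p_0$.

We can obtain the lower bound for $\mathrm{RD}_{ED}^{\mathrm{true}}$ using $\mathrm{RD}_{ED}^{\mathrm{true}} = f\,\mathrm{RD}_{ED+}^{\mathrm{true}} + (1 - f)\,\mathrm{RD}_{ED-}^{\mathrm{true}}$.

□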
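The three lower bounds of Proposition A.8 are simple enough to compute directly. The sketch below is illustrative only: the function names and the input values are hypothetical, and the bounding factor is taken to be $\mathrm{BF}_U = \mathrm{RR}_{EU}\mathrm{RR}_{UD}/(\mathrm{RR}_{EU} + \mathrm{RR}_{UD} - 1)$.

```python
def bounding_factor(rr_eu, rr_ud):
    """BF_U = RR_EU * RR_UD / (RR_EU + RR_UD - 1)."""
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)


def rd_lower_bounds(p1, p0, f, bf):
    """Lower bounds for the causal risk differences (exposed, unexposed, whole population)."""
    exposed = p1 - p0 * bf
    unexposed = p1 / bf - p0
    # The whole-population bound is the f-weighted average of the other two.
    whole = f * exposed + (1.0 - f) * unexposed
    return exposed, unexposed, whole


bf = bounding_factor(2.0, 2.0)                       # 4/3
exposed, unexposed, whole = rd_lower_bounds(0.10, 0.04, 0.2, bf)
print(exposed, unexposed, whole)
# When the prevalence f is unknown, the smaller of the first two bounds can be used.
print(min(exposed, unexposed))
```

The weighted average computed here agrees with both product forms stated in the proposition, which is a quick way to check a hand calculation.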


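If the probability of $E = 1$, $f$, is unknown, the above result about $\mathrm{RD}_{ED}^{\mathrm{true}}$ is not directly
useful. In the following, we obtain a lower bound for $\mathrm{RD}_{ED}^{\mathrm{true}}$ based on
$\mathrm{RD}_{ED}^{\mathrm{true}} = f\,\mathrm{RD}_{ED+}^{\mathrm{true}} + (1 - f)\,\mathrm{RD}_{ED-}^{\mathrm{true}}$ that does not depend on $f$.

Proposition A.9. We have $\mathrm{RD}_{ED}^{\mathrm{true}} \ge \min(p_1 - p_0 \times \mathrm{BF}_U, p_1/\mathrm{BF}_U - p_0)$. When $p_1 > p_0$ and
$1 \le \mathrm{BF}_U \le \mathrm{RR}_{ED}^{\mathrm{obs}}$, both terms are nonnegative and $p_1 - p_0 \times \mathrm{BF}_U = \mathrm{BF}_U \times (p_1/\mathrm{BF}_U - p_0)$,
so the above lower bound reduces to $\mathrm{RD}_{ED}^{\mathrm{true}} \ge p_1/\mathrm{BF}_U - p_0$.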

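The above results are particularly useful for an apparently causative exposure with
$\mathrm{RD}_{ED}^{\mathrm{obs}} > 0$, for which they give (possibly positive) lower bounds for the causal risk differences.
However, for an apparently preventive exposure with $\mathrm{RD}_{ED}^{\mathrm{obs}} < 0$, we need to modify the
definition of $\mathrm{RR}_{EU}$ as $\mathrm{RR}_{EU} = \max_u \mathrm{RR}_{EU}^{-1}(u)$, and we have the following analogous results.

Proposition A.10. For an apparently preventive exposure with $\mathrm{RD}_{ED}^{\mathrm{obs}} < 0$, we have

$$\mathrm{RD}_{ED+}^{\mathrm{true}} \le p_1 - p_0/\mathrm{BF}_U,$$

$$\mathrm{RD}_{ED-}^{\mathrm{true}} \le p_1 \times \mathrm{BF}_U - p_0,$$

$$\mathrm{RD}_{ED}^{\mathrm{true}} \le (p_1 \times \mathrm{BF}_U - p_0) \times \{(1 - f) + f/\mathrm{BF}_U\} = (p_1 - p_0/\mathrm{BF}_U) \times \{(1 - f) \times \mathrm{BF}_U + f\}.$$

When $f$ is unknown and $1 \le \mathrm{BF}_U \le 1/\mathrm{RR}_{ED}^{\mathrm{obs}}$, we have $\mathrm{RD}_{ED}^{\mathrm{true}} \le p_1 - p_0/\mathrm{BF}_U$.

Proof of Proposition A.10. Define $\bar{E} = 1 - E$, so that $\bar{E} = 1$ corresponds to $E = 0$ and
$P(\bar{E} = 1) = 1 - f$. Applying Proposition A.8 to $\bar{E}$, we have

$$\mathrm{RD}_{\bar{E}D+}^{\mathrm{true}} \ge P(D = 1 \mid \bar{E} = 1) - P(D = 1 \mid \bar{E} = 0) \times \mathrm{BF}_U = p_0 - p_1 \times \mathrm{BF}_U,$$

$$\mathrm{RD}_{\bar{E}D-}^{\mathrm{true}} \ge P(D = 1 \mid \bar{E} = 1)/\mathrm{BF}_U - P(D = 1 \mid \bar{E} = 0) = p_0/\mathrm{BF}_U - p_1,$$

$$\mathrm{RD}_{\bar{E}D}^{\mathrm{true}} \ge (p_0 - p_1 \times \mathrm{BF}_U) \times \{(1 - f) + f/\mathrm{BF}_U\} = (p_0/\mathrm{BF}_U - p_1) \times \{(1 - f) \times \mathrm{BF}_U + f\}.$$

Since $\mathrm{RD}_{\bar{E}D+}^{\mathrm{true}} = -\mathrm{RD}_{ED-}^{\mathrm{true}}$, $\mathrm{RD}_{\bar{E}D-}^{\mathrm{true}} = -\mathrm{RD}_{ED+}^{\mathrm{true}}$ and $\mathrm{RD}_{\bar{E}D}^{\mathrm{true}} = -\mathrm{RD}_{ED}^{\mathrm{true}}$, the first three
conclusions follow. When $f$ is unknown and $1 \le \mathrm{BF}_U \le 1/\mathrm{RR}_{ED}^{\mathrm{obs}}$, both upper bounds are
nonpositive and $p_1 \times \mathrm{BF}_U - p_0 = \mathrm{BF}_U \times (p_1 - p_0/\mathrm{BF}_U)$, so we have
$\mathrm{RD}_{ED}^{\mathrm{true}} \le \max(p_1 - p_0/\mathrm{BF}_U, p_1 \times \mathrm{BF}_U - p_0) = p_1 - p_0/\mathrm{BF}_U$.

□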

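The above discussion is within strata of observed covariates $C$. All probabilities are
essentially conditional probabilities, e.g., $P(D = 1 \mid E = 1, C = c)$, $P(E = 1 \mid C = c)$, etc.
Consequently, the bounding factor and causal risk differences are also conditional, denoted
as $\mathrm{BF}_{U|c}$, $\mathrm{RD}_{ED+|c}^{\mathrm{true}}$, $\mathrm{RD}_{ED-|c}^{\mathrm{true}}$ and $\mathrm{RD}_{ED|c}^{\mathrm{true}}$. Due to the linearity of the risk difference, i.e.,
$\mathrm{RD}_{ED+}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED+|c}^{\mathrm{true}} P(C = c \mid E = 1)$, $\mathrm{RD}_{ED-}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED-|c}^{\mathrm{true}} P(C = c \mid E = 0)$ and
$\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_c \mathrm{RD}_{ED|c}^{\mathrm{true}} P(C = c)$, we have the following results about the marginal risk differences:

$$\mathrm{RD}_{ED+}^{\mathrm{true}} \ge \sum_c \{P(D = 1 \mid E = 1, C = c) - P(D = 1 \mid E = 0, C = c) \times \mathrm{BF}_{U|c}\}P(C = c \mid E = 1),$$

$$\mathrm{RD}_{ED-}^{\mathrm{true}} \ge \sum_c \{P(D = 1 \mid E = 1, C = c)/\mathrm{BF}_{U|c} - P(D = 1 \mid E = 0, C = c)\}P(C = c \mid E = 0),$$

$$\mathrm{RD}_{ED}^{\mathrm{true}} \ge f \sum_c \{P(D = 1 \mid E = 1, C = c) - P(D = 1 \mid E = 0, C = c) \times \mathrm{BF}_{U|c}\}P(C = c \mid E = 1)$$
$$\qquad + (1 - f) \sum_c \{P(D = 1 \mid E = 1, C = c)/\mathrm{BF}_{U|c} - P(D = 1 \mid E = 0, C = c)\}P(C = c \mid E = 0).$$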

Appendix 5.2 Statistical Inference for the Causal Risk Differences 

In the previous subsection we discussed the population quantities assuming that we knew the
distribution of $(E, D, C)$. In this subsection, we discuss finite-sample inference for the
causal risk differences. We can straightforwardly estimate $f$, $p_1$ and $p_0$ by the sample frequencies
$\hat{f}$, $\hat{p}_1$ and $\hat{p}_0$ with standard errors $s$, $s_1$ and $s_0$, respectively. Then we can estimate the lower
bound for $\mathrm{RD}_{ED+}^{\mathrm{true}}$ by $\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U$ with standard error $(s_1^2 + s_0^2 \times \mathrm{BF}_U^2)^{1/2}$, estimate the
lower bound for $\mathrm{RD}_{ED-}^{\mathrm{true}}$ by $\hat{p}_1/\mathrm{BF}_U - \hat{p}_0$ with standard error $(s_1^2/\mathrm{BF}_U^2 + s_0^2)^{1/2}$, and estimate
the lower bound for $\mathrm{RD}_{ED}^{\mathrm{true}}$ by $(\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U) \times \{\hat{f} + (1 - \hat{f})/\mathrm{BF}_U\}$ or
$(\hat{p}_1/\mathrm{BF}_U - \hat{p}_0) \times \{\hat{f} \times \mathrm{BF}_U + (1 - \hat{f})\}$ with standard error

$$\sqrt{(s_1^2 + s_0^2 \times \mathrm{BF}_U^2)\{\hat{f} + (1 - \hat{f})/\mathrm{BF}_U\}^2 + (\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U)^2 (1 - \mathrm{BF}_U^{-1})^2 s^2},$$

using a standard delta-method argument. After obtaining the point estimates and their
standard errors, we can construct confidence intervals for these causal risk differences.

Note that even without estimating the prevalence, $f$, of the exposure, if the exposure is
apparently causative, we can use the lower bound of $\min(\mathrm{RD}_{ED+}^{\mathrm{true}}, \mathrm{RD}_{ED-}^{\mathrm{true}})$ as a lower bound
for $\mathrm{RD}_{ED}^{\mathrm{true}}$. The point estimate of the causal risk difference averaged over the observed
covariates can be obtained as the weighted average of the point estimates of the causal risk
differences within strata of $C$, with the proportions of the strata as the weights, and the
corresponding sampling variance is the weighted sum of the sampling variances within strata,
with the squared proportions of the strata as the weights.
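As an illustration of the delta-method interval for the whole population, the sketch below uses the input values listed in the SAS code of Appendix 6 and treats `s2_f`, `s2_p1` and `s2_p0` there as squared standard errors, which is how that code uses them. The function name `rd_whole_with_ci` is hypothetical, and this is a sketch rather than a replacement for the SAS program.

```python
import math


def rd_whole_with_ci(p1, p0, f, var_p1, var_p0, var_f, bf, z=1.96):
    """Lower bound for the whole-population causal risk difference with its interval.

    var_p1, var_p0 and var_f are squared standard errors of p1, p0 and f.
    """
    rd_exposed = p1 - p0 * bf
    weight = f + (1.0 - f) / bf
    estimate = rd_exposed * weight
    variance = ((var_p1 + var_p0 * bf ** 2) * weight ** 2
                + rd_exposed ** 2 * (1.0 - 1.0 / bf) ** 2 * var_f)
    half_width = z * math.sqrt(variance)
    return estimate, estimate - half_width, estimate + half_width


# RR_EU = RR_UD = 1.2, the first cell of the sensitivity table.
bf = 1.2 * 1.2 / (1.2 + 1.2 - 1.0)
est, low, high = rd_whole_with_ci(0.101645862, 0.0398930775, 0.2032934132,
                                  0.0147497019, 0.0058931321, 0.0069647038, bf)
print(round(est, 4), round(low, 4), round(high, 4))
```

With these inputs the rounded values agree with the entry 0.0593 (-0.2184, 0.3369) reported for $\mathrm{RR}_{EU} = \mathrm{RR}_{UD} = 1.2$ in Figure A.1.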

Appendix 5.3 Implied Cornfield Conditions 

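The results in Proposition A.8 imply the following Cornfield conditions.

Proposition A.11. For an unmeasured confounder to reduce the observed risk difference to
$\mathrm{RD}_{ED+}^{\mathrm{true}}$, $\mathrm{RD}_{ED-}^{\mathrm{true}}$ and $\mathrm{RD}_{ED}^{\mathrm{true}}$, respectively, the joint Cornfield conditions are

$$\mathrm{BF}_U \ge (p_1 - \mathrm{RD}_{ED+}^{\mathrm{true}})/p_0,$$

$$\mathrm{BF}_U \ge p_1/(p_0 + \mathrm{RD}_{ED-}^{\mathrm{true}}),$$

$$\mathrm{BF}_U \ge \frac{\sqrt{\{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}^2 + 4 p_1 p_0 f(1 - f)} - \{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}}{2 p_0 f}.$$

Proof of Proposition A.11. It is straightforward to see that the first two conclusions of
Proposition A.8 imply the first two inequalities. From the third conclusion of Proposition A.8,
we have the following quadratic inequality about $\mathrm{BF}_U$:

$$(p_0 f)\,\mathrm{BF}_U^2 + \{p_0(1 - f) + \mathrm{RD}_{ED}^{\mathrm{true}} - p_1 f\}\,\mathrm{BF}_U - p_1(1 - f) \ge 0.$$

The corresponding equation has one negative root and the following positive root:

$$\mathrm{BF}_U^* = \frac{\sqrt{\{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}^2 + 4 p_1 p_0 f(1 - f)} - \{\mathrm{RD}_{ED}^{\mathrm{true}} + p_0(1 - f) - p_1 f\}}{2 p_0 f}.$$

Since $\mathrm{BF}_U > 0$, the inequality has the solution $\mathrm{BF}_U \ge \mathrm{BF}_U^*$.

□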
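The positive root in the proof above can be evaluated directly. The following is a minimal sketch with a hypothetical function name and illustrative inputs; it computes the smallest bounding factor that reduces the whole-population bound to a target risk difference.

```python
import math


def min_bf_whole(p1, p0, f, rd_true):
    """Positive root of p0*f*B^2 + {p0*(1-f) + rd_true - p1*f}*B - p1*(1-f) = 0."""
    b = rd_true + p0 * (1.0 - f) - p1 * f
    return (math.sqrt(b * b + 4.0 * p1 * p0 * f * (1.0 - f)) - b) / (2.0 * p0 * f)


p1, p0, f = 0.10, 0.04, 0.2               # illustrative values only
# Under the null (target risk difference 0) the root equals p1/p0, up to rounding.
print(min_bf_whole(p1, p0, f, 0.0))
# For a target of 0.03 a smaller bounding factor suffices.
print(min_bf_whole(p1, p0, f, 0.03))
```

Substituting the returned value back into $(p_1 - p_0 \times \mathrm{BF}_U)\{f + (1 - f)/\mathrm{BF}_U\}$ recovers the target risk difference, which is a convenient check on the algebra.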


Similar to the discussion in the last two sections, we can also derive the low and high
threshold Cornfield conditions from the above joint Cornfield conditions for $(\mathrm{RR}_{EU}, \mathrm{RR}_{UD})$.
If $\mathrm{RD}_{ED+}^{\mathrm{true}}$, $\mathrm{RD}_{ED-}^{\mathrm{true}}$ and $\mathrm{RD}_{ED}^{\mathrm{true}}$ are zero, all the conditions in Proposition A.11 reduce to
$\mathrm{BF}_U \ge \mathrm{RR}_{ED}^{\mathrm{obs}}$, the one derived from the result about the relative risk of the exposure on the
outcome. Therefore, the formula from the risk difference is the same as that derived from the
relative risk under the null hypothesis, but they are different under the alternative hypotheses.

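With a finite sample, we can also find the smallest bounding factor that can reduce the lower
confidence limit of the lower bound of the causal risk differences to a certain magnitude.
We discuss $100(1 - \alpha)\%$ confidence intervals based on asymptotic normality, and let
$z_\alpha = \Phi^{-1}(1 - \alpha/2)$ denote the upper $\alpha/2$ quantile of the standard normal distribution (e.g., when
$\alpha = 0.05$, $z_{0.05} = 1.96$). In order to reduce the confidence interval of the risk difference on
the exposed to cover a true causal risk difference $\mathrm{RD}_{ED+}^{\mathrm{true}}$, the bounding factor must satisfy

$$\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U - z_\alpha \sqrt{s_1^2 + s_0^2 \times \mathrm{BF}_U^2} \le \mathrm{RD}_{ED+}^{\mathrm{true}},$$

which has the following solution:

$$\mathrm{BF}_U \ge \frac{\hat{p}_0(\hat{p}_1 - \mathrm{RD}_{ED+}^{\mathrm{true}}) - \sqrt{\hat{p}_0^2(\hat{p}_1 - \mathrm{RD}_{ED+}^{\mathrm{true}})^2 - (\hat{p}_0^2 - z_\alpha^2 s_0^2)\{(\hat{p}_1 - \mathrm{RD}_{ED+}^{\mathrm{true}})^2 - z_\alpha^2 s_1^2\}}}{\hat{p}_0^2 - z_\alpha^2 s_0^2}. \tag{A.3}$$

In order to reduce the confidence interval of the risk difference on the unexposed to cover a
true causal risk difference $\mathrm{RD}_{ED-}^{\mathrm{true}}$, the bounding factor must satisfy

$$\hat{p}_1/\mathrm{BF}_U - \hat{p}_0 - z_\alpha \sqrt{s_1^2/\mathrm{BF}_U^2 + s_0^2} \le \mathrm{RD}_{ED-}^{\mathrm{true}},$$

which has the following solution:

$$\mathrm{BF}_U \ge \frac{\hat{p}_1(\hat{p}_0 + \mathrm{RD}_{ED-}^{\mathrm{true}}) - \sqrt{\hat{p}_1^2(\hat{p}_0 + \mathrm{RD}_{ED-}^{\mathrm{true}})^2 - \{(\hat{p}_0 + \mathrm{RD}_{ED-}^{\mathrm{true}})^2 - z_\alpha^2 s_0^2\}(\hat{p}_1^2 - z_\alpha^2 s_1^2)}}{(\hat{p}_0 + \mathrm{RD}_{ED-}^{\mathrm{true}})^2 - z_\alpha^2 s_0^2}. \tag{A.4}$$

Note that if we assume $\mathrm{RD}_{ED+}^{\mathrm{true}} = \mathrm{RD}_{ED-}^{\mathrm{true}} = 0$, the above solutions in (A.3) and (A.4) reduce
to the same form:

$$\mathrm{BF}_U \ge \frac{\hat{p}_1\hat{p}_0 - \sqrt{\hat{p}_1^2\hat{p}_0^2 - (\hat{p}_0^2 - z_\alpha^2 s_0^2)(\hat{p}_1^2 - z_\alpha^2 s_1^2)}}{\hat{p}_0^2 - z_\alpha^2 s_0^2}.$$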
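The closed-form solution for the exposed can be checked by substituting it back into the confidence-limit inequality. The sketch below is illustrative: the function names and input values are hypothetical, and it assumes $\hat{p}_0^2 > z_\alpha^2 s_0^2$ so that the quadratic opens upward.

```python
import math


def min_bf_exposed_ci(p1, p0, s1, s0, rd_true, z=1.96):
    """Smaller root of (p0^2 - z^2 s0^2) B^2 - 2 p0 a B + (a^2 - z^2 s1^2) = 0, a = p1 - rd_true."""
    a = p1 - rd_true
    denom = p0 ** 2 - z ** 2 * s0 ** 2       # assumed positive in this sketch
    disc = p0 ** 2 * a ** 2 - denom * (a ** 2 - z ** 2 * s1 ** 2)
    return (p0 * a - math.sqrt(disc)) / denom


def lower_limit_exposed(p1, p0, s1, s0, bf, z=1.96):
    """Lower confidence limit of the lower bound for the risk difference on the exposed."""
    return p1 - p0 * bf - z * math.sqrt(s1 ** 2 + s0 ** 2 * bf ** 2)


# Illustrative inputs: estimates 0.10 and 0.04 with standard errors 0.01 and 0.005.
bf_star = min_bf_exposed_ci(0.10, 0.04, 0.01, 0.005, rd_true=0.0)
print(bf_star, lower_limit_exposed(0.10, 0.04, 0.01, 0.005, bf_star))
```

At the returned bounding factor the lower confidence limit equals the target risk difference (here zero) up to floating-point error; any larger bounding factor pushes the limit below the target.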



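In order to reduce the confidence interval of the risk difference to cover a true causal risk
difference $\mathrm{RD}_{ED}^{\mathrm{true}}$, the bounding factor must satisfy

$$(\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U)\{\hat{f} + (1 - \hat{f})/\mathrm{BF}_U\} - z_\alpha \sqrt{(s_1^2 + s_0^2 \times \mathrm{BF}_U^2)\{\hat{f} + (1 - \hat{f})/\mathrm{BF}_U\}^2 + (\hat{p}_1 - \hat{p}_0 \times \mathrm{BF}_U)^2 (1 - \mathrm{BF}_U^{-1})^2 s^2} \le \mathrm{RD}_{ED}^{\mathrm{true}}, \tag{A.5}$$

which can be solved numerically. For example, we can apply a grid search for the solution of
(A.5) over the following bounded range:

$$\left[1, \frac{\sqrt{\{\mathrm{RD}_{ED}^{\mathrm{true}} + \hat{p}_0(1 - \hat{f}) - \hat{p}_1 \hat{f}\}^2 + 4 \hat{p}_1 \hat{p}_0 \hat{f}(1 - \hat{f})} - \{\mathrm{RD}_{ED}^{\mathrm{true}} + \hat{p}_0(1 - \hat{f}) - \hat{p}_1 \hat{f}\}}{2 \hat{p}_0 \hat{f}}\right],$$

since the point estimate has already been reduced to $\mathrm{RD}_{ED}^{\mathrm{true}}$ when $\mathrm{BF}_U$ attains the
upper end of this range.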
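A grid search of this kind might be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: the function names, the grid size and the input values are hypothetical, and `s1`, `s0` and `s` denote the standard errors of the estimates of $p_1$, $p_0$ and $f$.

```python
import math


def lower_limit_whole(bf, p1, p0, f, s1, s0, s, z=1.96):
    """Lower confidence limit of the whole-population lower bound at bounding factor bf."""
    w = f + (1.0 - f) / bf
    rd = p1 - p0 * bf
    var = (s1 ** 2 + s0 ** 2 * bf ** 2) * w ** 2 + rd ** 2 * (1.0 - 1.0 / bf) ** 2 * s ** 2
    return rd * w - z * math.sqrt(var)


def upper_end_of_range(p1, p0, f, rd_true):
    """Bounding factor at which the point estimate itself equals rd_true."""
    b = rd_true + p0 * (1.0 - f) - p1 * f
    return (math.sqrt(b * b + 4.0 * p1 * p0 * f * (1.0 - f)) - b) / (2.0 * p0 * f)


def grid_search_bf(p1, p0, f, s1, s0, s, rd_true, z=1.96, n=10001):
    """Smallest grid value of BF_U in [1, upper end] whose lower limit is <= rd_true."""
    hi = upper_end_of_range(p1, p0, f, rd_true)
    for i in range(n):
        bf = 1.0 + (hi - 1.0) * i / (n - 1)
        if lower_limit_whole(bf, p1, p0, f, s1, s0, s, z) <= rd_true:
            return bf
    return hi


bf_grid = grid_search_bf(0.10, 0.04, 0.2, 0.01, 0.005, 0.02, rd_true=0.0)
print(bf_grid, upper_end_of_range(0.10, 0.04, 0.2, 0.0))
```

The search always terminates inside the range because, at the upper end, the point estimate equals the target and the lower confidence limit therefore lies below it. A finer grid or a root finder could replace the linear scan if more precision is needed.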


Appendix 6 SAS Code for the Risk Difference Using Sensitivity
Parameters on the Relative Risk Scale


In this section, we provide SAS code for sensitivity analysis on the risk difference scale. The 
SAS code here illustrates analysis using logistic regression for a binary outcome as this is an 
approach that is commonly employed. 

Suppose we have a dataset named “leadlogit” with variables lead, smoking, age, male. 
Suppose we are interested in the risk difference of smoking on the high blood lead level at 
the covariate level, age = 50 and male = 1. 

To implement the sensitivity analysis for the risk difference, we need to obtain the point estimate and
standard error for $f = P(E = 1)$, which can be done via the following SAS code.

proc means data=lib.leadlogit; /* f and se(f) */
  var smoking;
  output out=sumstat mean=mean var=var N=N;
run;

data sumstat (KEEP=MEAN SE);
  set sumstat;
  se = (var/N)**0.5;
run;


The following code obtains the predicted probabilities $p_{1|c} = P\{D(1) = 1 \mid C = c\}$ and
$p_{0|c} = P\{D(0) = 1 \mid C = c\}$ with standard errors.

proc logistic data = lib.leadlogit; /* predict probs */
  model lead = smoking age male;
  score data = lib.leadlogit_new out=logit_pred clm;
run;

proc contents data = logit_pred;
run;

data logit_pred (keep=P_TRUE se_p); /* p1 p0 se(p1) se(p0) */
  set logit_pred;
  /* recover the standard error on the logit scale from the confidence limits */
  logit_LCL_TRUE = log(LCL_TRUE/(1-LCL_TRUE));
  logit_P_TRUE   = log(P_TRUE/(1-P_TRUE));
  logit_UCL_TRUE = log(UCL_TRUE/(1-UCL_TRUE));
  se_eta = (logit_UCL_TRUE-logit_LCL_TRUE)/2/1.96;
  /* delta method: dp/d(eta) = p*(1-p) = p**2/exp(eta) */
  se_p = P_TRUE**2/EXP(logit_P_TRUE)*se_eta;
run;


In the following SAS code, we need to input from line 2 to line 7 the point estimates
and standard errors (entered in squared form as s2_f, s2_p1 and s2_p0) of the prevalence $f$
and the two predicted outcome probabilities $p_{1|c}$ and $p_{0|c}$. The output contains lower bounds
for the point estimates and confidence intervals of
the causal risk differences for the exposed, unexposed and the whole population. Figure A.1
is the SAS output for the causal risk difference estimates for the whole population. For other
problems, we need to change the numbers from line 2 to line 7 accordingly. We can also
change the measures of the strength of confounding in lines 8 and 9. The output from SAS
will be similar to the one shown in Figure A.1.

proc iml; /* Sensitivity analysis without assumptions for RD */
f     = 0.2032934132; /* point and interval estimate of prevalence and response rates */
p1    = 0.101645862;
p0    = 0.0398930775;
s2_f  = 0.0069647038;
s2_p1 = 0.0147497019;
s2_p0 = 0.0058931321;
RR_EU = {1.2 1.3 1.5 1.8 2 2.5 3 5}; /* strength of confounding */
RR_UD = {1.2 1.3 1.5 1.8 2 2.5 3 5};
rownames_EU = CHAR(RR_EU, NCOL(RR_EU), 1);
colnames_UD = CHAR(RR_UD, NCOL(RR_UD), 1);
BiasFactor = J(NCOL(RR_EU), NCOL(RR_UD), 1);
SPACE  = J(NCOL(RR_EU), NCOL(RR_UD), " ");
LeftP  = J(NCOL(RR_EU), NCOL(RR_UD), "(");
Mid    = J(NCOL(RR_EU), NCOL(RR_UD), ",");
RightP = J(NCOL(RR_EU), NCOL(RR_UD), ")");
/* initial values */
RD_exposed     = BiasFactor;
RD_exposed_L   = BiasFactor;
RD_exposed_U   = BiasFactor;
RD_unexposed   = BiasFactor;
RD_unexposed_L = BiasFactor;
RD_unexposed_U = BiasFactor;
RD_whole       = BiasFactor;
RD_whole_L     = BiasFactor;
RD_whole_U     = BiasFactor;
W_whole        = BiasFactor;
Var_exposed    = BiasFactor;
Var_unexposed  = BiasFactor;
Var_whole      = BiasFactor;
/* Sensitivity analysis */
DO i=1 TO NCOL(RR_EU);
  DO j=1 TO NCOL(RR_UD);
    BiasFactor[i, j] = RR_EU[i]*RR_UD[j]/(RR_EU[i] + RR_UD[j] - 1);
    /* exposed */
    RD_exposed[i, j] = p1 - p0*BiasFactor[i, j];
    Var_exposed[i, j] = s2_p1 + s2_p0*(BiasFactor[i, j])**2;
    RD_exposed_L[i, j] = RD_exposed[i, j] - 1.96*sqrt(Var_exposed[i, j]);
    RD_exposed_U[i, j] = RD_exposed[i, j] + 1.96*sqrt(Var_exposed[i, j]);
    /* unexposed */
    RD_unexposed[i, j] = p1/BiasFactor[i, j] - p0;
    Var_unexposed[i, j] = s2_p1/(BiasFactor[i, j])**2 + s2_p0;
    RD_unexposed_L[i, j] = RD_unexposed[i, j] - 1.96*sqrt(Var_unexposed[i, j]);
    RD_unexposed_U[i, j] = RD_unexposed[i, j] + 1.96*sqrt(Var_unexposed[i, j]);
    /* whole */
    W_whole[i, j] = f + (1-f)/BiasFactor[i, j];
    RD_whole[i, j] = RD_exposed[i, j]*W_whole[i, j];
    Var_whole[i, j] = Var_exposed[i, j]*(W_whole[i, j])**2
                    + (RD_exposed[i, j])**2*(1-1/BiasFactor[i, j])**2*s2_f;
    RD_whole_L[i, j] = RD_whole[i, j] - 1.96*sqrt(Var_whole[i, j]);
    RD_whole_U[i, j] = RD_whole[i, j] + 1.96*sqrt(Var_whole[i, j]);
  END;
END;
/* print */
RD_exposed = CATX(" ", CHAR(round(RD_exposed, 0.0001)), LeftP, CHAR(round(RD_exposed_L, 0.0001)), Mid,
                  CHAR(round(RD_exposed_U, 0.0001)), RightP);
print RD_exposed[colname = colnames_UD
                 rowname = rownames_EU
                 label = "Bounds on corrected estimates and confidence intervals for risk difference
among exposed (columns correspond to increasing strength of the risk ratio of U
on the outcome; rows correspond to increasing strength of risk ratio
relating the exposure and U)"];

RD_unexposed = CATX(" ", CHAR(round(RD_unexposed, 0.0001)), LeftP, CHAR(round(RD_unexposed_L, 0.0001)), Mid,
                    CHAR(round(RD_unexposed_U, 0.0001)), RightP);
print RD_unexposed[colname = colnames_UD
                   rowname = rownames_EU
                   label = "Bounds on corrected estimates and confidence intervals for risk difference
among unexposed (columns correspond to increasing strength of the risk ratio of U
on the outcome; rows correspond to increasing strength of risk ratio
relating the exposure and U)"];

RD_whole = CATX(" ", CHAR(round(RD_whole, 0.0001)), LeftP, CHAR(round(RD_whole_L, 0.0001)), Mid,
                CHAR(round(RD_whole_U, 0.0001)), RightP);
print RD_whole[colname = colnames_UD
               rowname = rownames_EU
               label = "Bounds on corrected estimates and confidence intervals for risk difference
among the whole population (columns correspond to increasing strength of the risk ratio of U
on the outcome; rows correspond to increasing strength of risk ratio
relating the exposure and U)"];

run;


Appendix 7 Results for the Risk Difference Using Sensitivity
Parameters on the Risk Difference Scale

Appendix 7.1 A Useful Proposition 

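We first recall some definitions in the main text, and assume a categorical unmeasured
confounder $U$. Let $\mathrm{RD}_{ED}^{\mathrm{obs}} = P(D = 1 \mid E = 1) - P(D = 1 \mid E = 0)$ denote the observed risk
difference,

$$\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_{k=0}^{K-1} \{P(D = 1 \mid E = 1, U = k) - P(D = 1 \mid E = 0, U = k)\}P(U = k)$$

denote the true causal risk difference, and $\mathrm{CRD}_{ED} = \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}$ denote the confounding
risk difference of the exposure $E$ on the outcome $D$. Define
$\alpha_k = P(U = k \mid E = 1) - P(U = k \mid E = 0)$ and $\mathrm{RD}_{EU} = \max_{k \ge 1} |\alpha_k|$. Define
$\beta_k^* = P(D = 1 \mid E = 1, U = k) - P(D = 1 \mid E = 1, U = 0)$ and
$\beta_k = P(D = 1 \mid E = 0, U = k) - P(D = 1 \mid E = 0, U = 0)$. Define
$\mathrm{RD}_{UD|E=1} = \max_{k \ge 1} |\beta_k^*|$, $\mathrm{RD}_{UD|E=0} = \max_{k \ge 1} |\beta_k|$ and
$\mathrm{RD}_{UD} = \max(\mathrm{RD}_{UD|E=1}, \mathrm{RD}_{UD|E=0})$. The confounding risk difference can be decomposed as follows.

Proposition A.12. The confounding risk difference of $E$ on $D$, $\mathrm{CRD}_{ED}$, can be expressed as

$$\mathrm{CRD}_{ED} = \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}} = \sum_{k=1}^{K-1} \alpha_k \{\beta_k^* P(E = 0) + \beta_k P(E = 1)\}.$$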

Figure A.1: SAS Output of Sensitivity Analysis on the Risk Difference Scale for the Whole
Population. Bounds on corrected estimates and confidence intervals for risk difference among
the whole population (columns correspond to increasing strength of the risk ratio of U on the
outcome; rows correspond to increasing strength of risk ratio relating the exposure and U).

RR_EU \ RR_UD | 1.2 | 1.3 | 1.5 | 1.8
1.2 | 0.0593 (-0.2184, 0.3369) | 0.0583 (-0.2178, 0.3345) | 0.0568 (-0.217, 0.3305) | 0.0551 (-0.2161, 0.3263)
1.3 | 0.0583 (-0.2178, 0.3345) | 0.057 (-0.2171, 0.3311) | 0.0548 (-0.216, 0.3257) | 0.0525 (-0.2148, 0.3199)
1.5 | 0.0568 (-0.217, 0.3305) | 0.0548 (-0.216, 0.3257) | 0.0517 (-0.2145, 0.318) | 0.0483 (-0.2131, 0.3098)
1.8 | 0.0551 (-0.2161, 0.3263) | 0.0525 (-0.2148, 0.3199) | 0.0483 (-0.2131, 0.3098) | 0.0438 (-0.2116, 0.2991)
2.0 | 0.0543 (-0.2157, 0.3242) | 0.0514 (-0.2143, 0.317) | 0.0466 (-0.2125, 0.3057) | 0.0414 (-0.211, 0.2939)
2.5 | 0.0528 (-0.215, 0.3205) | 0.0492 (-0.2134, 0.3119) | 0.0435 (-0.2115, 0.2986) | 0.0372 (-0.2103, 0.2847)
3.0 | 0.0517 (-0.2145, 0.318) | 0.0478 (-0.2129, 0.3085) | 0.0414 (-0.211, 0.2939) | 0.0343 (-0.2101, 0.2788)
5.0 | 0.0497 (-0.2136, 0.313) | 0.045 (-0.2119, 0.3019) | 0.0372 (-0.2103, 0.2847) | 0.0285 (-0.2105, 0.2675)

RR_EU \ RR_UD | 2.0 | 2.5 | 3.0 | 5.0
1.2 | 0.0543 (-0.2157, 0.3242) | 0.0528 (-0.215, 0.3205) | 0.0517 (-0.2145, 0.318) | 0.0497 (-0.2136, 0.313)
1.3 | 0.0514 (-0.2143, 0.317) | 0.0492 (-0.2134, 0.3119) | 0.0478 (-0.2129, 0.3085) | 0.045 (-0.2119, 0.3019)
1.5 | 0.0466 (-0.2125, 0.3057) | 0.0435 (-0.2115, 0.2986) | 0.0414 (-0.211, 0.2939) | 0.0372 (-0.2103, 0.2847)
1.8 | 0.0414 (-0.211, 0.2939) | 0.0372 (-0.2103, 0.2847) | 0.0343 (-0.2101, 0.2788) | 0.0285 (-0.2105, 0.2675)
2.0 | 0.0388 (-0.2105, 0.2881) | 0.034 (-0.2101, 0.2781) | 0.0307 (-0.2102, 0.2716) | 0.024 (-0.2116, 0.2595)
2.5 | 0.034 (-0.2101, 0.2781) | 0.028 (-0.2106, 0.2667) | 0.024 (-0.2116, 0.2595) | 0.0154 (-0.216, 0.2468)
3.0 | 0.0307 (-0.2102, 0.2716) | 0.024 (-0.2116, 0.2595) | 0.0193 (-0.2136, 0.2522) | 0.0093 (-0.2212, 0.2398)
5.0 | 0.024 (-0.2116, 0.2595) | 0.0154 (-0.216, 0.2468) | 0.0093 (-0.2212, 0.2398) | -0.0045 (-0.2402, 0.2312)

Proof of Proposition A.12. The true and observed risk differences of $E$ on $D$ can be expressed
as

$$\mathrm{RD}_{ED}^{\mathrm{true}} = \sum_{k=0}^{K-1} P(D = 1 \mid E = 1, U = k)P(U = k) - \sum_{k=0}^{K-1} P(D = 1 \mid E = 0, U = k)P(U = k),$$

$$\mathrm{RD}_{ED}^{\mathrm{obs}} = \sum_{k=0}^{K-1} P(D = 1 \mid E = 1, U = k)P(U = k \mid E = 1) - \sum_{k=0}^{K-1} P(D = 1 \mid E = 0, U = k)P(U = k \mid E = 0).$$
Therefore, the confounding risk difference of $E$ on $D$, $\mathrm{CRD}_{ED}$, can be expressed as

$$\mathrm{CRD}_{ED} = \sum_{k=0}^{K-1} P(D = 1 \mid E = 1, U = k)\{P(U = k \mid E = 1) - P(U = k)\}$$
$$\qquad - \sum_{k=0}^{K-1} P(D = 1 \mid E = 0, U = k)\{P(U = k \mid E = 0) - P(U = k)\}.$$

Applying the law of total probability, we have the following results:

$$P(U = k \mid E = 1) - P(U = k) = \alpha_k P(E = 0), \qquad P(U = k \mid E = 0) - P(U = k) = -\alpha_k P(E = 1).$$

Therefore, the confounding risk difference can be rewritten as

$$\mathrm{CRD}_{ED} = \sum_{k=0}^{K-1} \alpha_k P(D = 1 \mid E = 1, U = k)P(E = 0) + \sum_{k=0}^{K-1} \alpha_k P(D = 1 \mid E = 0, U = k)P(E = 1)$$
$$\qquad = \sum_{k=0}^{K-1} \alpha_k \{P(D = 1 \mid E = 1, U = k)P(E = 0) + P(D = 1 \mid E = 0, U = k)P(E = 1)\}.$$

Using the fact that $\alpha_0 = -\sum_{k=1}^{K-1} \alpha_k$, we can obtain that

$$\mathrm{CRD}_{ED} = \sum_{k=1}^{K-1} \alpha_k \{P(D = 1 \mid E = 1, U = k)P(E = 0) + P(D = 1 \mid E = 0, U = k)P(E = 1)\}$$
$$\qquad - \sum_{k=1}^{K-1} \alpha_k \{P(D = 1 \mid E = 1, U = 0)P(E = 0) + P(D = 1 \mid E = 0, U = 0)P(E = 1)\}$$
$$\qquad = \sum_{k=1}^{K-1} \alpha_k \{\beta_k^* P(E = 0) + \beta_k P(E = 1)\}. \quad □$$
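The decomposition can be verified numerically for any categorical confounder. The sketch below is an illustration with a hypothetical function name and made-up probabilities; it computes the confounding risk difference directly from its definition and again from the decomposition in Proposition A.12.

```python
def confounding_rd(pu1, pu0, r1, r0, f):
    """Return (CRD from its definition, CRD from the decomposition).

    pu1[k] = P(U=k | E=1), pu0[k] = P(U=k | E=0),
    r1[k] = P(D=1 | E=1, U=k), r0[k] = P(D=1 | E=0, U=k), f = P(E=1).
    """
    K = len(pu1)
    pu = [f * pu1[k] + (1 - f) * pu0[k] for k in range(K)]        # marginal P(U=k)
    rd_obs = (sum(r1[k] * pu1[k] for k in range(K))
              - sum(r0[k] * pu0[k] for k in range(K)))
    rd_true = sum((r1[k] - r0[k]) * pu[k] for k in range(K))
    alpha = [pu1[k] - pu0[k] for k in range(K)]
    beta_star = [r1[k] - r1[0] for k in range(K)]
    beta = [r0[k] - r0[0] for k in range(K)]
    decomposed = sum(alpha[k] * (beta_star[k] * (1 - f) + beta[k] * f)
                     for k in range(1, K))
    return rd_obs - rd_true, decomposed


# A three-level confounder with made-up probabilities.
direct, decomposed = confounding_rd(
    pu1=[0.2, 0.3, 0.5], pu0=[0.5, 0.3, 0.2],
    r1=[0.1, 0.2, 0.4], r0=[0.05, 0.1, 0.3], f=0.3)
print(direct, decomposed)
```

Both printed values agree (0.0855 for these inputs, up to floating-point error), as the proposition requires.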

Appendix 7.2 Binary Confounder 

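For a binary confounder $U$ with $K = 2$, we have the following proposition.

Proposition A.13. When $U$ is binary, we have $\mathrm{RD}_{EU} \times \mathrm{RD}_{UD} \ge \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}$, implying

$$\min(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}, \qquad \max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}}.$$

Proof of Proposition A.13. We have

$$\mathrm{CRD}_{ED} = \alpha_1\{\beta_1^* P(E = 0) + \beta_1 P(E = 1)\} = \mathrm{RD}_{EU}\{\mathrm{RD}_{UD|E=1}P(E = 0) + \mathrm{RD}_{UD|E=0}P(E = 1)\}.$$

Since $\mathrm{CRD}_{ED} > 0$ and $\mathrm{RD}_{EU} > 0$, we have $\mathrm{RD}_{UD|E=1}P(E = 0) + \mathrm{RD}_{UD|E=0}P(E = 1) > 0$.
Therefore, $\mathrm{RD}_{UD|E=1}$ and $\mathrm{RD}_{UD|E=0}$ cannot both be negative, and thus we have

$$\mathrm{RD}_{UD|E=1}P(E = 0) + \mathrm{RD}_{UD|E=0}P(E = 1) \le \max(\mathrm{RD}_{UD|E=1}, \mathrm{RD}_{UD|E=0}) = \mathrm{RD}_{UD}.$$

Therefore, $\mathrm{CRD}_{ED} \le \mathrm{RD}_{EU} \times \mathrm{RD}_{UD}$, which implies that
$\min(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \mathrm{CRD}_{ED} = \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}$ and
$\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{CRD}_{ED}} = \sqrt{\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}}$. □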

Appendix 7.3 General Categorical Confounder 

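For a categorical confounder $U$, no simple form of the bounding factor is available, but we can
still show that $\mathrm{RD}_{EU}$ and $\mathrm{RD}_{UD}$ must satisfy the following conditions:

Proposition A.14. For a categorical confounder $U$, we have

$$\mathrm{RD}_{EU} \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1),$$

$$\mathrm{RD}_{UD} \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2,$$

$$\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \max\left\{\sqrt{(\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1)}, (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2\right\}.$$

When $K = 3$, such as for a three-level genetic confounder, these conditions reduce to

$$\min(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2, \qquad \max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{(\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/2}.$$

Proof of Proposition A.14. Since

$$\mathrm{CRD}_{ED} = \sum_{k=1}^{K-1} \alpha_k\{\beta_k^* P(E = 0) + \beta_k P(E = 1)\} \le \mathrm{RD}_{EU} \sum_{k=1}^{K-1} |\beta_k^* P(E = 0) + \beta_k P(E = 1)|$$
$$\qquad \le \mathrm{RD}_{EU} \sum_{k=1}^{K-1} \max(|\beta_k^*|, |\beta_k|) \le \mathrm{RD}_{EU}(K - 1),$$

we have $\mathrm{RD}_{EU} \ge \mathrm{CRD}_{ED}/(K - 1)$. The equality is attainable if and only if (c1)
$\alpha_k = \mathrm{CRD}_{ED}/(K - 1)$ and $\beta_k^* = \beta_k = 1$ for $k = 1, \ldots, K - 1$; or (c2) $\alpha_k = -\mathrm{CRD}_{ED}/(K - 1)$ and
$\beta_k^* = \beta_k = -1$ for $k = 1, \ldots, K - 1$. Condition (c1) requires that the risk difference of the
exposure $E$ on each category of $U$ be the same, $\mathrm{CRD}_{ED}/(K - 1)$, and that the confounder $U$ be a
perfect predictor of the disease $D$. A similar interpretation applies to condition (c2).

Since

$$\mathrm{CRD}_{ED} = \sum_{k=1}^{K-1} \alpha_k\{\beta_k^* P(E = 0) + \beta_k P(E = 1)\} \le \sum_{k=1}^{K-1} |\alpha_k| \max(|\beta_k^*|, |\beta_k|) \le \mathrm{RD}_{UD} \sum_{k=1}^{K-1} |\alpha_k|$$
$$\qquad \le \mathrm{RD}_{UD} \sum_{k=1}^{K-1} P(U = k \mid E = 1) + \mathrm{RD}_{UD} \sum_{k=1}^{K-1} P(U = k \mid E = 0) \le 2\,\mathrm{RD}_{UD},$$

the lower bound for $\mathrm{RD}_{UD}$ is $\mathrm{RD}_{UD} \ge \mathrm{CRD}_{ED}/2$. The equality is attainable if and only
if $P(U = 0 \mid E = 0) = P(U = 0 \mid E = 1) = 0$, $P(U = k \mid E = 1)P(U = k \mid E = 0) = 0$ for
$k = 1, \ldots, K - 1$, and $\beta_k = \beta_k^* = \pm\mathrm{CRD}_{ED}/2$ with the sign the same as the sign of $\alpha_k$.

Since $\mathrm{CRD}_{ED} \le (K - 1)\mathrm{RD}_{EU}\mathrm{RD}_{UD} \le (K - 1)\max^2(\mathrm{RD}_{EU}, \mathrm{RD}_{UD})$, we have
$\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$, with the equality attainable if and only if
$\alpha_k = \beta_k^* = \beta_k = \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$ for $k = 1, \ldots, K - 1$. Due to the constraint
$\sum_{k=1}^{K-1} |\alpha_k| \le 2$ discussed above, the equality is attainable if and only if
$(K - 1)\sqrt{\mathrm{CRD}_{ED}/(K - 1)} \le 2$ or $(K - 1)\mathrm{CRD}_{ED} \le 4$. When $(K - 1)\mathrm{CRD}_{ED} > 4$,
$\mathrm{RD}_{UD}$ can attain its lower bound $\mathrm{CRD}_{ED}/2$ with $\sum_{k=1}^{K-1} |\alpha_k| = 2$. Therefore,
$\mathrm{RD}_{EU}$ can attain its lower bound $2/(K - 1)$, which, in this case, is smaller than $\mathrm{CRD}_{ED}/2$. In
summary, the lower bound for $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD})$ is $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$
if $(K - 1)\mathrm{CRD}_{ED} \le 4$, and $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \mathrm{CRD}_{ED}/2$ if $(K - 1)\mathrm{CRD}_{ED} > 4$.
Equivalently, we have $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \max\left\{\sqrt{\mathrm{CRD}_{ED}/(K - 1)}, \mathrm{CRD}_{ED}/2\right\}$. □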
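The thresholds in Proposition A.14 depend only on the confounding risk difference and the number of categories. The sketch below is illustrative, with a hypothetical function name and example inputs.

```python
import math


def rd_cornfield_thresholds(crd, K):
    """Lower bounds for RD_EU, RD_UD and max(RD_EU, RD_UD) given CRD_ED = crd and K categories."""
    low_eu = crd / (K - 1)
    low_ud = crd / 2.0
    high = max(math.sqrt(crd / (K - 1)), crd / 2.0)
    return low_eu, low_ud, high


# A three-level confounder explaining away a risk difference of 0.08.
print(rd_cornfield_thresholds(0.08, 3))
# A binary confounder: the high threshold is sqrt(crd).
print(rd_cornfield_thresholds(0.09, 2))
```

For a risk difference of 0.08 and $K = 3$, the low thresholds are both 0.04 and the high threshold is about 0.2, so at least one of the two sensitivity parameters must be substantially larger than the effect being explained away.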

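For the Cornfield conditions for the risk difference, sharper conditions can be obtained by
imposing the monotonicity assumption that $\alpha_k \ge 0$ for $k = 1, \ldots, K - 1$. It requires that
each non-zero category of $U$ is more prevalent under exposure, which is naturally satisfied for a
binary confounder. Under the monotonicity assumption, the conditions for the risk difference
can be strengthened.

Proposition A.15. For a categorical confounder under monotonicity, we have that

$$\mathrm{RD}_{EU} \ge (\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1),$$

$$\mathrm{RD}_{UD} \ge \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}},$$

$$\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \max\left\{\sqrt{(\mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}})/(K - 1)}, \mathrm{RD}_{ED}^{\mathrm{obs}} - \mathrm{RD}_{ED}^{\mathrm{true}}\right\}.$$

Proof of Proposition A.15. The bound for $\mathrm{RD}_{EU}$ remains the same. Since

$$\mathrm{CRD}_{ED} = \sum_{k=1}^{K-1} \alpha_k\{\beta_k^* P(E = 0) + \beta_k P(E = 1)\} \le \mathrm{RD}_{UD} \sum_{k=1}^{K-1} |\alpha_k| \le \mathrm{RD}_{UD}(-\alpha_0) \le \mathrm{RD}_{UD},$$

the lower bound for $\mathrm{RD}_{UD}$ is $\mathrm{RD}_{UD} \ge \mathrm{CRD}_{ED}$. The equality is attainable if and only if
$\alpha_0 = -1$ and $\beta_k = \beta_k^* = \mathrm{CRD}_{ED}$ for $k = 1, \ldots, K - 1$. The condition requires that the presence
or absence of the confounder $U$ is perfectly predictive of the exposure $E$, and each category
of $U$ is equally predictive of the disease $D$.

Since $\mathrm{CRD}_{ED} \le (K - 1)\mathrm{RD}_{EU}\mathrm{RD}_{UD} \le (K - 1)\max^2(\mathrm{RD}_{EU}, \mathrm{RD}_{UD})$, we have
$\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$, with the equality attainable if and only if
$\alpha_k = \beta_k^* = \beta_k = \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$ for $k = 1, \ldots, K - 1$. Due to the constraint
$\sum_{k=1}^{K-1} \alpha_k = -\alpha_0 \le 1$ discussed above, the equality
is attainable if and only if $(K - 1)\sqrt{\mathrm{CRD}_{ED}/(K - 1)} \le 1$ or $(K - 1)\mathrm{CRD}_{ED} \le 1$. When
$(K - 1)\mathrm{CRD}_{ED} > 1$, $\mathrm{RD}_{UD}$ can attain its lower bound $\mathrm{CRD}_{ED}$ with $\sum_{k=1}^{K-1} \alpha_k = 1$. Therefore,
$\mathrm{RD}_{EU}$ can attain its lower bound $1/(K - 1)$, which, in this case, is smaller than $\mathrm{CRD}_{ED}$. In
summary, the lower bound for $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD})$ is $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \sqrt{\mathrm{CRD}_{ED}/(K - 1)}$
if $(K - 1)\mathrm{CRD}_{ED} \le 1$, and $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \mathrm{CRD}_{ED}$ if $(K - 1)\mathrm{CRD}_{ED} > 1$. Equivalently,
we have $\max(\mathrm{RD}_{EU}, \mathrm{RD}_{UD}) \ge \max\left\{\sqrt{\mathrm{CRD}_{ED}/(K - 1)}, \mathrm{CRD}_{ED}\right\}$. □

The results in Propositions A.12 to A.15 generalize previous results from the null
hypothesis of no effect ($\mathrm{RD}_{ED}^{\mathrm{true}} = 0$) to alternative hypotheses ($\mathrm{RD}_{ED}^{\mathrm{true}}$ arbitrary).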

Appendix 8 A Bounding Factor for Rare Time-to-Event Outcome
on the Hazard Ratio Scale

Let $f$, $S$, $\lambda$ be the probability density, survival function and hazard function of a positive
continuous outcome $T$. The outcome is rare in the sense that $P(T \le \mathcal{T})$ is not much greater
than 0, where $\mathcal{T}$ is the time point at which the study period of interest ends. In the following, we
will always make the rare outcome assumption. Although $f$, $S$, $\lambda$ are defined on the whole
positive real line, our interest lies only within the interval $[0, \mathcal{T}]$. Let $U$ be another random variable,
and let $f(t \mid u)$, $S(t \mid u)$, $\lambda(t \mid u)$ be the conditional probability density, survival function, and
hazard function of $T$ given $U$. The following lemma is useful throughout our discussion.

Lemma A.4. If $T$ is a rare time-to-event outcome, we have the following approximation:

$$\lambda(t) \approx \int \lambda(t \mid u)F(du).$$

Proof of Lemma A.4. Similar to the discrete case, we have $S(t \mid u) \approx 1$ for a rare
outcome, and therefore

$$\lambda(t) = \frac{f(t)}{S(t)} = \frac{\int f(t \mid u)F(du)}{\int S(t \mid u)F(du)} = \frac{\int \lambda(t \mid u)S(t \mid u)F(du)}{\int S(t \mid u)F(du)} \approx \frac{\int \lambda(t \mid u)F(du)}{\int F(du)} = \int \lambda(t \mid u)F(du).$$

□

Lemma A.4 essentially allows a “Law of Total Probability” type of calculation for the
hazard function with a rare outcome.

In order to introduce the new bounding factor for the hazard ratio, we need more formal
notation. Define the potential outcomes for $T$ as $T(1)$ and $T(0)$ with hazard functions $\lambda^{(1)}(t)$ and
$\lambda^{(0)}(t)$; the conditional hazard functions can be defined intuitively as $\lambda^{(1)}(t \mid \cdot)$ and $\lambda^{(0)}(t \mid \cdot)$.
We define $\lambda_t^*(u) = \lambda(t \mid E = 1, U = u)$ and $\lambda_t(u) = \lambda(t \mid E = 0, U = u)$ as the conditional
hazard functions of $T$ for the exposed and unexposed units within strata $U = u$, respectively. We
define $\mathrm{HR}_{UT|E=1}(t) = \max_u \lambda_t^*(u)/\min_u \lambda_t^*(u)$ as the maximal hazard ratio function of the
confounder $U$ on the outcome $T$ for exposed units, $\mathrm{HR}_{UT|E=0}(t) = \max_u \lambda_t(u)/\min_u \lambda_t(u)$
for unexposed units, and their maximum, denoted by $\mathrm{HR}_{UT}(t) = \max\{\mathrm{HR}_{UT|E=1}(t), \mathrm{HR}_{UT|E=0}(t)\}$,
as the maximal hazard ratio function of the confounder $U$ on the outcome $T$. Note that the
hazard ratios are time-dependent.

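If the exposure and the outcome are unconfounded given $U$ and the observed covariates
$C$ (which is omitted in conditional probabilities for simplicity), Lemma A.4 allows us to write
the true causal hazard ratios for the exposed, unexposed, and the whole population as

$$\mathrm{HR}_{ET+}^{\mathrm{true}}(t) = \frac{\lambda^{(1)}(t \mid E = 1)}{\lambda^{(0)}(t \mid E = 1)} \approx \frac{\int \lambda_t^*(u)F_1(du)}{\int \lambda_t(u)F_1(du)},$$

$$\mathrm{HR}_{ET-}^{\mathrm{true}}(t) = \frac{\lambda^{(1)}(t \mid E = 0)}{\lambda^{(0)}(t \mid E = 0)} \approx \frac{\int \lambda_t^*(u)F_0(du)}{\int \lambda_t(u)F_0(du)},$$

$$\mathrm{HR}_{ET}^{\mathrm{true}}(t) = \frac{\lambda^{(1)}(t)}{\lambda^{(0)}(t)} \approx \frac{\int \lambda_t^*(u)F(du)}{\int \lambda_t(u)F(du)},$$

and the observed hazard ratio as

$$\mathrm{HR}_{ET}^{\mathrm{obs}}(t) = \frac{\lambda(t \mid E = 1)}{\lambda(t \mid E = 0)} \approx \frac{\int \lambda_t^*(u)F_1(du)}{\int \lambda_t(u)F_0(du)}.$$

With categorical $U$ taking values $0, 1, \ldots, K - 1$, the true causal hazard ratios can be
approximated by the following standardized hazard ratios:

$$\mathrm{HR}_{ET+}^{\mathrm{true}}(t) \approx \frac{\sum_{k=0}^{K-1} \lambda_t^*(k)P(U = k \mid E = 1)}{\sum_{k=0}^{K-1} \lambda_t(k)P(U = k \mid E = 1)},$$

$$\mathrm{HR}_{ET-}^{\mathrm{true}}(t) \approx \frac{\sum_{k=0}^{K-1} \lambda_t^*(k)P(U = k \mid E = 0)}{\sum_{k=0}^{K-1} \lambda_t(k)P(U = k \mid E = 0)},$$

$$\mathrm{HR}_{ET}^{\mathrm{true}}(t) \approx \frac{\sum_{k=0}^{K-1} \lambda_t^*(k)P(U = k)}{\sum_{k=0}^{K-1} \lambda_t(k)P(U = k)}.$$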

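The confounding hazard ratios are defined as

$$\mathrm{CHR}_{ET+}(t) = \frac{\mathrm{HR}_{ET}^{\mathrm{obs}}(t)}{\mathrm{HR}_{ET+}^{\mathrm{true}}(t)}, \qquad \mathrm{CHR}_{ET-}(t) = \frac{\mathrm{HR}_{ET}^{\mathrm{obs}}(t)}{\mathrm{HR}_{ET-}^{\mathrm{true}}(t)}, \qquad \mathrm{CHR}_{ET}(t) = \frac{\mathrm{HR}_{ET}^{\mathrm{obs}}(t)}{\mathrm{HR}_{ET}^{\mathrm{true}}(t)}.$$

Analogous to the results for the relative risk, we have the following propositions for the
hazard ratio. The proofs are straightforward if we replace $\{r(\cdot), r^*(\cdot)\}$ in the proofs for the
relative risk by $\{\lambda_t(\cdot), \lambda_t^*(\cdot)\}$.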


Proposition A.16. For a rare time-to-event outcome, we approximately have

$$\mathrm{HR}^{\mathrm{true}}_{ET}(t) = w_t \mathrm{HR}^{\mathrm{true}}_{ET+}(t) + (1 - w_t) \mathrm{HR}^{\mathrm{true}}_{ET-}(t),$$

$$1/\mathrm{CHR}_{ET}(t) = w_t/\mathrm{CHR}_{ET+}(t) + (1 - w_t)/\mathrm{CHR}_{ET-}(t),$$

where $w_t$ is a weight between zero and one:

$$w_t = \frac{f \int \lambda_t(u) F_1(du)}{f \int \lambda_t(u) F_1(du) + (1 - f) \int \lambda_t(u) F_0(du)} \in [0, 1].$$
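The weighted-average identities in Proposition A.16 can be checked numerically for a categorical confounder. The sketch below is only an illustration: all hazards and probabilities are hypothetical values at one fixed time $t$, $f$ is assumed to denote $P(E = 1)$, and the standardized hazard ratios are taken as the working definitions of the true causal hazard ratios.

```python
def dot(xs, ws):
    """Weighted sum: sum_k xs[k] * ws[k]."""
    return sum(x * w for x, w in zip(xs, ws))

# Hypothetical values at one fixed time t (illustrative only, not from the paper).
lam1 = [0.020, 0.035, 0.060]  # lambda*_t(k): hazard with exposure, stratum U = k
lam0 = [0.010, 0.015, 0.040]  # lambda_t(k): hazard without exposure, stratum U = k
p1 = [0.2, 0.3, 0.5]          # P(U = k | E = 1)
p0 = [0.5, 0.3, 0.2]          # P(U = k | E = 0)
f = 0.4                       # assumed to be P(E = 1)
p = [f * a + (1 - f) * b for a, b in zip(p1, p0)]  # marginal P(U = k)

# Standardized (true) hazard ratios and the observed hazard ratio.
hr_plus = dot(lam1, p1) / dot(lam0, p1)
hr_minus = dot(lam1, p0) / dot(lam0, p0)
hr_all = dot(lam1, p) / dot(lam0, p)
hr_obs = dot(lam1, p1) / dot(lam0, p0)

# Weight w_t from Proposition A.16.
num = f * dot(lam0, p1)
w_t = num / (num + (1 - f) * dot(lam0, p0))

# Confounding hazard ratios: observed divided by true.
chr_plus = hr_obs / hr_plus
chr_minus = hr_obs / hr_minus
chr_all = hr_obs / hr_all

print("w_t =", round(w_t, 4))
print("HR_all =", round(hr_all, 4),
      "weighted =", round(w_t * hr_plus + (1 - w_t) * hr_minus, 4))
```

With these inputs the whole-population hazard ratio coincides with the $w_t$-weighted average of the two subgroup hazard ratios, and the reciprocal confounding hazard ratios satisfy the same weighting.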


Define the time-varying bounding factor as

$$\mathrm{BF}_U(t) = \frac{\mathrm{RR}_{EU} \times \mathrm{HR}_{UT}(t)}{\mathrm{RR}_{EU} + \mathrm{HR}_{UT}(t) - 1},$$

which is also time-dependent. The confounding hazard ratios can be bounded by the bounding
factor, as shown in the following proposition.


Proposition A.17. For a rare time-to-event outcome, we approximately have

$$\mathrm{CHR}_{ET+}(t) \le \mathrm{BF}_U(t), \quad \mathrm{CHR}_{ET-}(t) \le \mathrm{BF}_U(t), \quad \mathrm{CHR}_{ET}(t) \le \mathrm{BF}_U(t).$$
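Because a confounding hazard ratio is the observed hazard ratio divided by the true one, the bound in Proposition A.17 implies that the true hazard ratio is at least the observed hazard ratio divided by the bounding factor. The following sketch evaluates this on a small time grid. The function name `bounding_factor`, the time points, and all numerical values are hypothetical choices for illustration.

```python
def bounding_factor(rr_eu, hr_ut):
    """Bounding factor RR_EU * HR_UT / (RR_EU + HR_UT - 1).

    Both sensitivity parameters are maximal ratios, so they are at least 1.
    """
    if rr_eu < 1 or hr_ut < 1:
        raise ValueError("sensitivity parameters must be >= 1")
    return rr_eu * hr_ut / (rr_eu + hr_ut - 1)

# Hypothetical sensitivity parameters and observed hazard ratios by time.
rr_eu = 2.5
hr_ut_by_time = {1.0: 4.0, 2.0: 3.0, 5.0: 1.5}   # HR_UT(t)
hr_obs_by_time = {1.0: 2.4, 2.0: 2.0, 5.0: 1.6}  # observed HR_ET(t)

lower_bounds = {}
for t, hr_ut in hr_ut_by_time.items():
    bf_t = bounding_factor(rr_eu, hr_ut)
    # CHR(t) = HR_obs(t) / HR_true(t) <= BF(t)  =>  HR_true(t) >= HR_obs(t) / BF(t)
    lower_bounds[t] = hr_obs_by_time[t] / bf_t
    print(f"t={t}: BF={bf_t:.3f}, true HR >= {lower_bounds[t]:.3f}")
```

The bounding factor never exceeds the smaller of its two arguments, so a confounder that is weak on either dimension cannot produce a large confounding hazard ratio.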


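Proposition A.18. The implied Cornfield conditions for the hazard ratio from Proposition
A.17 are

$$\mathrm{RR}_{EU} \ge \max_t \mathrm{CHR}_{ET}(t),$$

$$\mathrm{HR}_{UT}(t) \ge \mathrm{CHR}_{ET}(t),$$

$$\max\{\mathrm{RR}_{EU}, \mathrm{HR}_{UT}(t)\} \ge \mathrm{CHR}_{ET}(t) + \sqrt{\mathrm{CHR}_{ET}(t)\{\mathrm{CHR}_{ET}(t) - 1\}}.$$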
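The high threshold in the third condition can be computed directly. The sketch below is a hypothetical helper, not part of the original appendix. It evaluates the threshold for a given confounding hazard ratio and confirms that when both sensitivity parameters equal the threshold, the bounding factor equals that confounding hazard ratio exactly.

```python
import math

def joint_threshold(chr_value):
    """Threshold CHR + sqrt(CHR * (CHR - 1)) for max{RR_EU, HR_UT(t)}."""
    if chr_value < 1:
        raise ValueError("confounding hazard ratio must be >= 1")
    return chr_value + math.sqrt(chr_value * (chr_value - 1.0))

def bounding_factor(rr_eu, hr_ut):
    """Bounding factor RR_EU * HR_UT / (RR_EU + HR_UT - 1)."""
    return rr_eu * hr_ut / (rr_eu + hr_ut - 1.0)

def can_reach(rr_eu, hr_ut, chr_value):
    """True if the bounding factor is large enough to allow the given CHR."""
    return bounding_factor(rr_eu, hr_ut) >= chr_value

chr_value = 2.0  # hypothetical confounding hazard ratio at some time t
x = joint_threshold(chr_value)
print("threshold:", round(x, 4))                       # 2 + sqrt(2)
print("BF at threshold:", round(bounding_factor(x, x), 4))
print("RR_EU = HR_UT = 3.0 suffices:", can_reach(3.0, 3.0, chr_value))
print("RR_EU = HR_UT = 3.5 suffices:", can_reach(3.5, 3.5, chr_value))
```

In this example, two sensitivity parameters of 3.0 each satisfy the first two Cornfield conditions for a confounding hazard ratio of 2 but fall short of the joint threshold, so the third condition is the binding one.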

If a proportional hazards model for the outcome is used, as is often the case in practice,
all the above exposure-outcome hazard ratios reduce to a constant, $\mathrm{HR}_{ET}(t) = \mathrm{HR}_{ET}$. The
above discussion works well for an exposure that is apparently causative at time $t$ on the
hazard ratio scale. If at some time point $t$ the exposure is apparently preventive, then the
above discussion needs to be modified. To be more specific, we need to modify the definition
of $\mathrm{RR}_{EU}$ as in Section Appendix 3.3, and the confounding hazard ratios above are replaced
by their reciprocals. Likewise, results on the hazard difference scale similar to those on
the risk difference scale in eAppendix A.Appendix 5 hold, provided that the outcome is relatively
rare.

Appendix 9 A Bounding Factor for General Nonnegative 
Outcomes 

The discussion above assumes a binary outcome $D$, and in fact all the proofs only use
the property that $r(u)$ and $r^*(u)$ are nonnegative. Therefore, the bounding factor also
applies to any nonnegative outcome (counts, continuous positive outcomes, etc.), if we modify
the definitions of $r(u)$, $r^*(u)$, and $\mathrm{RR}_{UD}$ in the following way. For general nonnegative
outcomes, we define $r^*(u) = E(D \mid E = 1, U = u)$ and $r(u) = E(D \mid E = 0, U = u)$ as
the expectations of the outcome within stratum $U = u$ with and without exposure. Define
$\mathrm{MR}_{UD|E=1} = \max_u r^*(u) / \min_u r^*(u)$ and $\mathrm{MR}_{UD|E=0} = \max_u r(u) / \min_u r(u)$ as the mean
ratios of $U$ on $D$ with and without exposure, and $\mathrm{MR}_{UD} = \max(\mathrm{MR}_{UD|E=1}, \mathrm{MR}_{UD|E=0})$ as the
maximum of these two mean ratios. Note that when $D$ is binary, $r(u)$ and $r^*(u)$ reduce to
probabilities, and the mean ratios reduce to the relative risks.

The observed mean ratio of the exposure on the outcome is

$$\mathrm{MR}^{\mathrm{obs}}_{ED} = \frac{\int E(D \mid E = 1, U = u) F_1(du)}{\int E(D \mid E = 0, U = u) F_0(du)} = \frac{\int r^*(u) F_1(du)}{\int r(u) F_0(du)}.$$

The true causal mean ratio of the exposure on the outcome for exposed is

$$\mathrm{MR}^{\mathrm{true}}_{ED+} = \frac{\int E(D \mid E = 1, U = u) F_1(du)}{\int E(D \mid E = 0, U = u) F_1(du)} = \frac{\int r^*(u) F_1(du)}{\int r(u) F_1(du)},$$

the true causal mean ratio of the exposure on the outcome for unexposed is

$$\mathrm{MR}^{\mathrm{true}}_{ED-} = \frac{\int E(D \mid E = 1, U = u) F_0(du)}{\int E(D \mid E = 0, U = u) F_0(du)} = \frac{\int r^*(u) F_0(du)}{\int r(u) F_0(du)},$$

and the true causal mean ratio of the exposure on the outcome for the whole population is

$$\mathrm{MR}^{\mathrm{true}}_{ED} = \frac{\int E(D \mid E = 1, U = u) F(du)}{\int E(D \mid E = 0, U = u) F(du)} = \frac{\int r^*(u) F(du)}{\int r(u) F(du)}.$$

Define the bounding factor as

$$\mathrm{BF}_U = \frac{\mathrm{RR}_{EU} \times \mathrm{MR}_{UD}}{\mathrm{RR}_{EU} + \mathrm{MR}_{UD} - 1}.$$

Since the discussion in Section Appendix 2 still holds, the proofs for the following proposi¬ 
tions are the same as those in Appendices A.2 and A.4. First, we have the following bounding 
factor for nonnegative outcomes: 


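Proposition A.19.

$$\mathrm{CMR}_{ED+} = \frac{\mathrm{MR}^{\mathrm{obs}}_{ED}}{\mathrm{MR}^{\mathrm{true}}_{ED+}} \le \mathrm{BF}_U, \quad \mathrm{CMR}_{ED-} = \frac{\mathrm{MR}^{\mathrm{obs}}_{ED}}{\mathrm{MR}^{\mathrm{true}}_{ED-}} \le \mathrm{BF}_U, \quad \mathrm{CMR}_{ED} = \frac{\mathrm{MR}^{\mathrm{obs}}_{ED}}{\mathrm{MR}^{\mathrm{true}}_{ED}} \le \mathrm{BF}_U.$$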
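As a numerical illustration of Proposition A.19, the sketch below uses a count-type outcome with a three-level confounder. All stratum means and probabilities are hypothetical, $f$ is assumed to denote $P(E = 1)$, and $\mathrm{RR}_{EU}$ is assumed to be the maximal ratio $\max_k P(U = k \mid E = 1)/P(U = k \mid E = 0)$, following the definition used for the relative risk.

```python
def dot(xs, ws):
    """Weighted sum: sum_k xs[k] * ws[k]."""
    return sum(x * w for x, w in zip(xs, ws))

# Hypothetical stratum-specific means of a nonnegative outcome D.
r1 = [2.0, 3.0, 6.0]   # r*(k) = E(D | E = 1, U = k)
r0 = [1.0, 2.0, 4.0]   # r(k)  = E(D | E = 0, U = k)
p1 = [0.2, 0.3, 0.5]   # P(U = k | E = 1)
p0 = [0.5, 0.3, 0.2]   # P(U = k | E = 0)
f = 0.4                # assumed to be P(E = 1)
p = [f * a + (1 - f) * b for a, b in zip(p1, p0)]  # marginal P(U = k)

# Sensitivity parameters and bounding factor.
rr_eu = max(a / b for a, b in zip(p1, p0))
mr_ud = max(max(r1) / min(r1), max(r0) / min(r0))
bf = rr_eu * mr_ud / (rr_eu + mr_ud - 1)

# Observed and true mean ratios, then confounding mean ratios.
mr_obs = dot(r1, p1) / dot(r0, p0)
cmr_plus = mr_obs / (dot(r1, p1) / dot(r0, p1))
cmr_minus = mr_obs / (dot(r1, p0) / dot(r0, p0))
cmr_all = mr_obs / (dot(r1, p) / dot(r0, p))

for name, value in [("CMR+", cmr_plus), ("CMR-", cmr_minus), ("CMR", cmr_all)]:
    print(f"{name} = {value:.3f} <= BF = {bf:.3f}: {value <= bf}")
```

In this example all three confounding mean ratios fall below the bounding factor, as the proposition requires. A single example is a consistency check only and does not replace the proof.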

In practice, we might also be interested in the average causal effect of the exposure on
the outcome on the difference scale. The observed mean difference of the exposure on the
outcome is

$$E(D \mid E = 1) - E(D \mid E = 0) = m_1 - m_0.$$


The average causal effect of the exposure on the outcome for exposed is

$$\mathrm{ACE}^{\mathrm{true}}_{ED+} = \int E(D \mid E = 1, U = u) F_1(du) - \int E(D \mid E = 0, U = u) F_1(du) = m_1 - \int r(u) F_1(du),$$

the average causal effect of the exposure on the outcome for unexposed is

$$\mathrm{ACE}^{\mathrm{true}}_{ED-} = \int E(D \mid E = 1, U = u) F_0(du) - \int E(D \mid E = 0, U = u) F_0(du) = \int r^*(u) F_0(du) - m_0,$$

and the average causal effect of the exposure on the outcome for the whole population is

$$\mathrm{ACE}^{\mathrm{true}}_{ED} = \int E(D \mid E = 1, U = u) F(du) - \int E(D \mid E = 0, U = u) F(du) = f \, \mathrm{ACE}^{\mathrm{true}}_{ED+} + (1 - f) \, \mathrm{ACE}^{\mathrm{true}}_{ED-}.$$

Similar to the discussion in Section Appendix 5 for the risk difference with sensitivity
parameters expressed on the risk ratio scale, we have the following proposition about the
average causal effect.

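Proposition A.20. For nonnegative outcomes, the lower bounds for the average causal effects
are

$$\mathrm{ACE}^{\mathrm{true}}_{ED+} \ge m_1 - m_0 \times \mathrm{BF}_U,$$

$$\mathrm{ACE}^{\mathrm{true}}_{ED-} \ge m_1/\mathrm{BF}_U - m_0,$$

$$\mathrm{ACE}^{\mathrm{true}}_{ED} \ge (m_1 - m_0 \times \mathrm{BF}_U) \times \{f + (1 - f)/\mathrm{BF}_U\} = (m_1/\mathrm{BF}_U - m_0) \times \{f \times \mathrm{BF}_U + (1 - f)\}.$$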
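The three lower bounds in Proposition A.20 depend only on the observed means, the bounding factor, and the exposure prevalence. A minimal sketch follows, assuming hypothetical inputs and a helper name `ace_lower_bounds` that is not from the source. It also checks that the two equivalent forms of the whole-population bound agree and that this bound is the $f$-weighted average of the two subgroup bounds.

```python
def ace_lower_bounds(m1, m0, bf, f):
    """Lower bounds for the average causal effects (Proposition A.20).

    m1, m0: observed outcome means among exposed and unexposed.
    bf:     bounding factor BF_U (>= 1).
    f:      assumed to be P(E = 1).
    Returns (bound for exposed, bound for unexposed, bound for everyone).
    """
    if bf < 1 or not 0 <= f <= 1:
        raise ValueError("need bf >= 1 and 0 <= f <= 1")
    lower_plus = m1 - m0 * bf
    lower_minus = m1 / bf - m0
    lower_all = lower_plus * (f + (1 - f) / bf)
    return lower_plus, lower_minus, lower_all

# Hypothetical observed means, bounding factor, and exposure prevalence.
m1, m0, bf, f = 4.3, 1.9, 10.0 / 5.5, 0.4
lower_plus, lower_minus, lower_all = ace_lower_bounds(m1, m0, bf, f)

# Second form of the whole-population bound from the proposition.
alt_all = (m1 / bf - m0) * (f * bf + (1 - f))

print("exposed:   ", round(lower_plus, 4))
print("unexposed: ", round(lower_minus, 4))
print("population:", round(lower_all, 4), "alt form:", round(alt_all, 4))
```

When the bounding factor equals 1, meaning no confounding is allowed, all three bounds reduce to the observed mean difference $m_1 - m_0$.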

We can also obtain similar forms of the conclusion for apparently preventive exposure,
for average causal effects averaged over observed covariates, and for corresponding Cornfield
conditions. The only difference is that $(p_1, p_0)$ is replaced by $(m_1, m_0)$.